Posted to dev@zookeeper.apache.org by Michael Han <ha...@apache.org> on 2021/04/21 03:03:16 UTC

Re: write performance issue in 3.6.2

What does the workload look like? Is it pure write, or mixed read/write?

A couple of ideas to move this forward:
* Publish the performance benchmark so the community can help (see the
sketch below).
* Bisect git commit and find the bad commit that caused the regression.
* Use the fine-grained metrics introduced in 3.6 (e.g. per-processor-stage
metrics) to measure where time is spent during writes. We might have to add
these metrics to 3.4 to get a fair comparison.

For the throttling: the RequestThrottler introduced in 3.6 does add some
latency, but it should not impact throughput this much.
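
On the first suggestion (publishing the benchmark), here is a minimal sketch
of the kind of client-side write benchmark that could be shared and run
unchanged against both a 3.4 and a 3.6 ensemble. The connect string, payload
size and request count are placeholders, not the setup behind the numbers
reported in this thread:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class WriteBench {
    public static void main(String[] args) throws Exception {
        String connect = "127.0.0.1:2181";        // placeholder connect string
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper(connect, 30000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        // One sequential znode per run keeps the benchmark re-runnable.
        String path = zk.create("/bench", new byte[128],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);

        int requests = 100_000;                   // placeholder request count
        long totalMicros = 0, maxMicros = 0;
        for (int i = 0; i < requests; i++) {
            long start = System.nanoTime();
            zk.setData(path, new byte[128], -1);  // synchronous write
            long micros = (System.nanoTime() - start) / 1_000;
            totalMicros += micros;
            maxMicros = Math.max(maxMicros, micros);
        }
        System.out.printf("writes=%d avg=%dus max=%dus%n",
                requests, totalMicros / requests, maxMicros);
        zk.close();
    }
}

Running the same class against each version with GC logging enabled would
also cover Patrick's earlier suggestion about ruling out GC.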

On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:

> The CPU usage of both server and client are normal (< 50%) during the test.
>
> Based on the investigation, the server is too busy with the load.
>
> The issue doesn't exist in 3.4.14. I wonder why there is a significant
> write performance degradation from 3.4.14 to 3.6.2 and how we can address
> the issue.
>
> Best,
>
> Li
>
>
>
>
>
>
>
>
>
>
>
> On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org> wrote:
>
> > What is the CPU usage of both server and client during the test?
> >
> > Looks like server is dropping the clients because either the server or
> > both are too busy to deal with the load.
> > This log line is also concerning: "Too busy to snap, skipping”
> >
> > If that’s the case I believe you'll have to profile the server process to
> > figure out where the perf bottleneck is.
> >
> > Andor
> >
> >
> >
> >
> > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > >
> > > Thanks, Patrick.
> > >
> > > Yes, we are using the same JVM version and GC configurations when
> > > running the two tests. I have checked the GC metrics and also the heap
> > dump
> > > of the 3.6, the GC pause and the memory usage look okay.
> > >
> > > Best,
> > >
> > > Li
> > >
> > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org> wrote:
> > >
> > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com> wrote:
> > >>
> > >>> Hi Enrico, Sushant,
> > >>>
> > >>> I re-run the perf test with the data consistency check feature
> disabled
> > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance issue
> of
> > >> 3.6
> > >>> is still there.
> > >>>
> > >>> With everything exactly the same, the throughput of 3.6 was only 1/2
> of
> > >> 3.4
> > >>> and the max latency was more than 8 times.
> > >>>
> > >>> Any other points or thoughts?
> > >>>
> > >>>
> > >> In the past I've noticed a big impact of GC when doing certain
> > performance
> > >> measurements. I assume you are using the same JVM version and GC when
> > >> running the two tests? Perhaps our memory footprint has expanded over
> > time.
> > >> You should rule out GC by running with gc logging turned on with both
> > >> versions and compare the impact.
> > >>
> > >> Regards,
> > >>
> > >> Patrick
> > >>
> > >>
> > >>> Cheers,
> > >>>
> > >>> Li
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com> wrote:
> > >>>
> > >>>> Thanks Sushant and Enrico!
> > >>>>
> > >>>> This is a really good point.  According to the 3.6 documentation,
> the
> > >>>> feature is disabled by default.
> > >>>>
> > >>>
> > >>
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > >>> .
> > >>>> However, checking the code, the default is enabled.
> > >>>>
> > >>>> Let me set the zookeeper.digest.enabled to false and see how the
> write
> > >>>> operation performs.
> > >>>>
> > >>>> Best,
> > >>>>
> > >>>> Li
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> sushantmane7@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Li,
> > >>>>>
> > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by default:
> > >>>>>
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > >>>>> .
> > >>>>> It is not present in ZK 3.4.14.
> > >>>>>
> > >>>>> This feature does have some impact on write performance.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Sushant
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > eolivelli@gmail.com
> > >>>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Li,
> > >>>>>> I wonder of we have some new throttling/back pressure mechanisms
> > >> that
> > >>> is
> > >>>>>> enabled by default.
> > >>>>>>
> > >>>>>> Does anyone has some pointer to relevant implementations?
> > >>>>>>
> > >>>>>>
> > >>>>>> Enrico
> > >>>>>>
> > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha scritto:
> > >>>>>>
> > >>>>>>> Hi,
> > >>>>>>>
> > >>>>>>> We switched to Netty on both client side and server side and the
> > >>>>>>> performance issue is still there.  Anyone has any insights on
> what
> > >>>>> could
> > >>>>>> be
> > >>>>>>> the cause of higher latency?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>>
> > >>>>>>> Li
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> > >> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Enrico,
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks for the reply.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > >>>>>>>>
> > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,  Max
> > >>>>> Latency
> > >>>>>>> 31s
> > >>>>>>>>
> > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > >>> Latency:
> > >>>>>> 1.6s
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > >>>>>>>>
> > >>>>>>>> 10G of Heap
> > >>>>>>>>
> > >>>>>>>> 13G of Memory
> > >>>>>>>>
> > >>>>>>>> 5 Participante
> > >>>>>>>>
> > >>>>>>>> 5 Observere
> > >>>>>>>>
> > >>>>>>>> Client session timeout: 3000ms
> > >>>>>>>>
> > >>>>>>>> Server min session time: 4000ms
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> > >>>>> session”
> > >>>>>>>> INFO log
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
> > >>>>>>>>
> > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > >> client,
> > >>> it
> > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366,
> > >>>>> session =
> > >>>>>>>> 0x400189fee9a000b
> > >>>>>>>>
> > >>>>>>>> at
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > >>>>>>>>
> > >>>>>>>> at
> > >>>>>>
> > >> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > >>>>>>>>
> > >>>>>>>> at
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > >>>>>>>>
> > >>>>>>>> at
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > >>>>>>>>
> > >>>>>>>> at
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > >>>>>>>>
> > >>>>>>>> at
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > >>>>>>>>
> > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > >>>>> skipping
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > >> Actually,
> > >>>>> the
> > >>>>>>>> issue happened with the combinations of
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 3.4 client and 3.6 server
> > >>>>>>>>
> > >>>>>>>> 3.6 client and 3.6 server
> > >>>>>>>>
> > >>>>>>>> Please let me know if you need any additional info.
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>>
> > >>>>>>>> Li
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > >>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi Enrico,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for the reply.
> > >>>>>>>>>
> > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based yet.
> > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > >>>>>>>>>
> > >>>>>>>>> 3.6:
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > >>>>> eolivelli@gmail.com
> > >>>>>>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4 and
> > >>> about
> > >>>>>> using
> > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > >>>>>>>>>>
> > >>>>>>>>>> Apart from that macro difference there have been many many
> > >>> changes
> > >>>>>>> since
> > >>>>>>>>>> 3.4.
> > >>>>>>>>>>
> > >>>>>>>>>> Do you have some metrics to share?
> > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration equals
> > >> to
> > >>>>> each
> > >>>>>>>>>> other?
> > >>>>>>>>>>
> > >>>>>>>>>> Do you see warnings on the server logs?
> > >>>>>>>>>>
> > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > >>> server?
> > >>>>>>>>>>
> > >>>>>>>>>> Enrico
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > >>> scritto:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi,
> > >>>>>>>>>>>
> > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > >>>>> perform/load
> > >>>>>>>>>>> comparison test,  it was found that the performance of 3.6
> > >> has
> > >>>>> been
> > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > >>> operation.
> > >>>>>> Under
> > >>>>>>>>>> the
> > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > >>>>>>> ConnectionLoss
> > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > >>>>>>>>>>>
> > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster of 5
> > >>>>>>>>>> participants
> > >>>>>>>>>>> and 5 observers. The min session timeout on the server side
> > >> is
> > >>>>>>> 4000ms.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > >>> insights
> > >>>>> on
> > >>>>>>> what
> > >>>>>>>>>>> could be the cause of the performance degradation.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks
> > >>>>>>>>>>>
> > >>>>>>>>>>> Li
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
>

Re: write performance issue in 3.6.2

Posted by Li Wang <li...@gmail.com>.
Hi Srikant,

1. Have you tried running the test without Prometheus metrics enabled? What
I observed is that enabling Prometheus has a significant performance impact
(about a 40%-60% degradation).
2. In addition to the session expiry errors and the increase in max latency,
did you see any issue with throughput?
3. Request throttling is disabled by default; however, all requests still go
through the RequestThrottler. Here is the relevant code (see the sketch after
this list):

private static volatile int maxRequests =
    Integer.getInteger("zookeeper.request_throttle_max_requests", 0);

4. What are the request throttling logs you've seen?
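
To make the pass-through in point 3 concrete, here is a minimal sketch of the
behaviour, assuming that maxRequests == 0 means "no cap" as the default above
suggests. The class and method names are illustrative, not the actual
RequestThrottler API:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Illustrative sketch only -- not the real org.apache.zookeeper.server.RequestThrottler.
class ThrottleSketch {
    private static final int MAX_REQUESTS =
            Integer.getInteger("zookeeper.request_throttle_max_requests", 0);

    private final LinkedBlockingQueue<Object> submitted = new LinkedBlockingQueue<>();
    private volatile int inFlight = 0;  // placeholder for the server's in-flight counter

    // Every request takes this extra queue hop, even with throttling "disabled".
    void submit(Object request) throws InterruptedException {
        submitted.put(request);
    }

    void drainLoop(Consumer<Object> nextProcessor) throws InterruptedException {
        while (true) {
            Object request = submitted.take();
            // The cap is only enforced when maxRequests > 0; the default of 0 means unlimited.
            while (MAX_REQUESTS > 0 && inFlight >= MAX_REQUESTS) {
                Thread.sleep(1);  // illustrative back-off; the real throttler waits to be notified
            }
            inFlight++;
            nextProcessor.accept(request);
        }
    }
}

Even with the cap disabled, the extra thread hand-off is consistent with
Michael's point that the throttler adds some latency but should not halve
throughput on its own.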

Best,

Li





On Tue, Apr 20, 2021 at 9:06 PM shrikant kalani <sh...@gmail.com>
wrote:

> Hello Everyone,
>
> We are also using zookeeper 3.6.2 with ssl turned on both sides. We
> observed the same behaviour where under high write load the ZK server
> starts expiring the session. There are no jvm related issues. During high
> load the max latency increases significantly.
>
> Also the session expiration message is not accurate. We do have session
> expiration set to 40 sec but ZK server disconnects the client within 10
> sec.
>
> Also the logs prints throttling the request but ZK documentation says
> throttling is disabled by default. Can someone check the code once to see
> if it is enabled or disabled. I am not a developer and hence not familiar
> with java code.
>
> Thanks
> Srikant Kalani
>
> On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@apache.org> wrote:
>
> > What is the workload looking like? Is it pure write, or mixed read write?
> >
> > A couple of ideas to move this forward:
> > * Publish the performance benchmark so the community can help.
> > * Bisect git commit and find the bad commit that caused the regression.
> > * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> > metrics) to measure where time spends during writes. We might have to add
> > these metrics on 3.4 to get a fair comparison.
> >
> > For the throttling - the RequestThrottler introduced in 3.6 does
> introduce
> > latency, but should not impact throughput this much.
> >
> > On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
> >
> > > The CPU usage of both server and client are normal (< 50%) during the
> > test.
> > >
> > > Based on the investigation, the server is too busy with the load.
> > >
> > > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > > write performance degradation from 3.4.14 to 3.6.2 and how we can
> address
> > > the issue.
> > >
> > > Best,
> > >
> > > Li
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org>
> wrote:
> > >
> > > > What is the CPU usage of both server and client during the test?
> > > >
> > > > Looks like server is dropping the clients because either the server
> or
> > > > both are too busy to deal with the load.
> > > > This log line is also concerning: "Too busy to snap, skipping”
> > > >
> > > > If that’s the case I believe you'll have to profile the server
> process
> > to
> > > > figure out where the perf bottleneck is.
> > > >
> > > > Andor
> > > >
> > > >
> > > >
> > > >
> > > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > > >
> > > > > Thanks, Patrick.
> > > > >
> > > > > Yes, we are using the same JVM version and GC configurations when
> > > > > running the two tests. I have checked the GC metrics and also the
> > heap
> > > > dump
> > > > > of the 3.6, the GC pause and the memory usage look okay.
> > > > >
> > > > > Best,
> > > > >
> > > > > Li
> > > > >
> > > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> > wrote:
> > > > >
> > > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com>
> wrote:
> > > > >>
> > > > >>> Hi Enrico, Sushant,
> > > > >>>
> > > > >>> I re-run the perf test with the data consistency check feature
> > > disabled
> > > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> > issue
> > > of
> > > > >> 3.6
> > > > >>> is still there.
> > > > >>>
> > > > >>> With everything exactly the same, the throughput of 3.6 was only
> > 1/2
> > > of
> > > > >> 3.4
> > > > >>> and the max latency was more than 8 times.
> > > > >>>
> > > > >>> Any other points or thoughts?
> > > > >>>
> > > > >>>
> > > > >> In the past I've noticed a big impact of GC when doing certain
> > > > performance
> > > > >> measurements. I assume you are using the same JVM version and GC
> > when
> > > > >> running the two tests? Perhaps our memory footprint has expanded
> > over
> > > > time.
> > > > >> You should rule out GC by running with gc logging turned on with
> > both
> > > > >> versions and compare the impact.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Patrick
> > > > >>
> > > > >>
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Li
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com>
> wrote:
> > > > >>>
> > > > >>>> Thanks Sushant and Enrico!
> > > > >>>>
> > > > >>>> This is a really good point.  According to the 3.6
> documentation,
> > > the
> > > > >>>> feature is disabled by default.
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > > >>> .
> > > > >>>> However, checking the code, the default is enabled.
> > > > >>>>
> > > > >>>> Let me set the zookeeper.digest.enabled to false and see how the
> > > write
> > > > >>>> operation performs.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>>
> > > > >>>> Li
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > > sushantmane7@gmail.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Hi Li,
> > > > >>>>>
> > > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> > default:
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > > >>>>> .
> > > > >>>>> It is not present in ZK 3.4.14.
> > > > >>>>>
> > > > >>>>> This feature does have some impact on write performance.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Sushant
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > > eolivelli@gmail.com
> > > > >>>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Li,
> > > > >>>>>> I wonder of we have some new throttling/back pressure
> mechanisms
> > > > >> that
> > > > >>> is
> > > > >>>>>> enabled by default.
> > > > >>>>>>
> > > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Enrico
> > > > >>>>>>
> > > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> > scritto:
> > > > >>>>>>
> > > > >>>>>>> Hi,
> > > > >>>>>>>
> > > > >>>>>>> We switched to Netty on both client side and server side and
> > the
> > > > >>>>>>> performance issue is still there.  Anyone has any insights on
> > > what
> > > > >>>>> could
> > > > >>>>>> be
> > > > >>>>>>> the cause of higher latency?
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>>
> > > > >>>>>>> Li
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> > > > >> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Enrico,
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks for the reply.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> > Max
> > > > >>>>> Latency
> > > > >>>>>>> 31s
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > > > >>> Latency:
> > > > >>>>>> 1.6s
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > > >>>>>>>>
> > > > >>>>>>>> 10G of Heap
> > > > >>>>>>>>
> > > > >>>>>>>> 13G of Memory
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Participante
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Observere
> > > > >>>>>>>>
> > > > >>>>>>>> Client session timeout: 3000ms
> > > > >>>>>>>>
> > > > >>>>>>>> Server min session time: 4000ms
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> > > > >>>>> session”
> > > > >>>>>>>> INFO log
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected
> exception
> > > > >>>>>>>>
> > > > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > > > >> client,
> > > > >>> it
> > > > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366
> ,
> > > > >>>>> session =
> > > > >>>>>>>> 0x400189fee9a000b
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>
> > > > >>
> > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > > >>>>>>>>
> > > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > > > >>>>> skipping
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > > >> Actually,
> > > > >>>>> the
> > > > >>>>>>>> issue happened with the combinations of
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> Please let me know if you need any additional info.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>>
> > > > >>>>>>>> Li
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > > > >>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi Enrico,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for the reply.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based
> yet.
> > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > >>>>>>>>>
> > > > >>>>>>>>> 3.6:
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > > >>>>> eolivelli@gmail.com
> > > > >>>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4
> and
> > > > >>> about
> > > > >>>>>> using
> > > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Apart from that macro difference there have been many many
> > > > >>> changes
> > > > >>>>>>> since
> > > > >>>>>>>>>> 3.4.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you have some metrics to share?
> > > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration
> equals
> > > > >> to
> > > > >>>>> each
> > > > >>>>>>>>>> other?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > > > >>> server?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Enrico
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > > > >>> scritto:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > > >>>>> perform/load
> > > > >>>>>>>>>>> comparison test,  it was found that the performance of
> 3.6
> > > > >> has
> > > > >>>>> been
> > > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > > >>> operation.
> > > > >>>>>> Under
> > > > >>>>>>>>>> the
> > > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > > > >>>>>>> ConnectionLoss
> > > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster
> of
> > 5
> > > > >>>>>>>>>> participants
> > > > >>>>>>>>>>> and 5 observers. The min session timeout on the server
> side
> > > > >> is
> > > > >>>>>>> 4000ms.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > > >>> insights
> > > > >>>>> on
> > > > >>>>>>> what
> > > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Li
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by shrikant kalani <sh...@gmail.com>.
Hi Andor

Thanks for your reply.

We are planning to perform one more round of stress testing, after which I
will be able to provide the detailed logs needed for any troubleshooting.
Other details are provided against each question below.


- which version of Zookeeper is being used,
3.6.2 on the server side and 3.6.1 on the client side

- how many nodes are you running in the ZK cluster,
3-node cluster
- what is the server configuration? any custom setting is in place?
The server runs with the standard configuration. We have 30G of memory
allocated to each JVM. The number of znodes in the cluster at any time
ranges between 2 million and 4 million.

- what is the hardware and software setup? on-prem or cloud? instance type?
CPU, memory, disk properties, operating system, etc.

It’s on-prem, running on RHEL 7. The bare-metal host has 48 cores and 378G of
memory shared among different services. We are using SSD drives.

- network characteristics
Could you clarify what details I should provide here?

- how many clients are connected and what are they doing? share the
relevant source code of your client or the command that you’re running,
Around 120 client connections on each node.

- 3.6 has advanced monitoring capabilities, setup Prometheus and share
screenshots of relevant metrics
We have Prometheus and Grafana up and running. Is there any specific metric
we should be looking at? So far, what we have noticed is that latency spikes
when we see the issue.

- server and client logs, debug enabled if possible,
Will try to provide from our next testing.

- security settings: TLS, Kerberos, etc.
TLS enabled in quorum as well as for client connections.

- ...anything else which could be important

Thanks
Srikant Kalani

On Fri, 23 Apr 2021 at 5:25 PM, Andor Molnar <an...@apache.org> wrote:

> Hi folks,
>
> As previously mentioned the community won’t be able to help if you don’t
> share more information about your scenario. We need to see the following:
>
> - which version of Zookeeper is being used,
> - how many nodes are you running in the ZK cluster,
> - what is the server configuration? any custom setting is in place?
> - what is the hardware and software setup? on-prem or cloud? instance
> type? CPU, memory, disk properties, operating system, etc.
> - network characteristics
> - how many clients are connected and what are they doing? share the
> relevant source code of your client or the command that you’re running,
> - 3.6 has advanced monitoring capabilities, setup Prometheus and share
> screenshots of relevant metrics
> - server and client logs, debug enabled if possible,
> - security settings: TLS, Kerberos, etc.
> - ...anything else which could be important
>
> In a nutshell, either you have to share information about your production
> system or provide a reproduction setup. Performance issues are pretty hard
> to resolve, because of the so many moving parts. The community is willing
> to help, but you need to share information to be successful.
>
> shrikant,
> ZK 3.6 has throttling for both client connections and requests. Request
> throttling can be disabled and it’s disabled by default, but connection
> throttling is not. From the log messages we can tell which throttling is in
> effect for your scenario.
>
> Regards,
> Andor
>
>
>
> > On 2021. Apr 21., at 5:25, shrikant kalani <sh...@gmail.com>
> wrote:
> >
> > Hello Everyone,
> >
> > We are also using zookeeper 3.6.2 with ssl turned on both sides. We
> > observed the same behaviour where under high write load the ZK server
> > starts expiring the session. There are no jvm related issues. During high
> > load the max latency increases significantly.
> >
> > Also the session expiration message is not accurate. We do have session
> > expiration set to 40 sec but ZK server disconnects the client within 10
> sec.
> >
> > Also the logs prints throttling the request but ZK documentation says
> > throttling is disabled by default. Can someone check the code once to see
> > if it is enabled or disabled. I am not a developer and hence not familiar
> > with java code.
> >
> > Thanks
> > Srikant Kalani
> >
> > On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@apache.org> wrote:
> >
> >> What is the workload looking like? Is it pure write, or mixed read
> write?
> >>
> >> A couple of ideas to move this forward:
> >> * Publish the performance benchmark so the community can help.
> >> * Bisect git commit and find the bad commit that caused the regression.
> >> * Use the fine grained metrics introduced in 3.6 (e.g per processor
> stage
> >> metrics) to measure where time spends during writes. We might have to
> add
> >> these metrics on 3.4 to get a fair comparison.
> >>
> >> For the throttling - the RequestThrottler introduced in 3.6 does
> introduce
> >> latency, but should not impact throughput this much.
> >>
> >> On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
> >>
> >>> The CPU usage of both server and client are normal (< 50%) during the
> >> test.
> >>>
> >>> Based on the investigation, the server is too busy with the load.
> >>>
> >>> The issue doesn't exist in 3.4.14. I wonder why there is a significant
> >>> write performance degradation from 3.4.14 to 3.6.2 and how we can
> address
> >>> the issue.
> >>>
> >>> Best,
> >>>
> >>> Li
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org>
> wrote:
> >>>
> >>>> What is the CPU usage of both server and client during the test?
> >>>>
> >>>> Looks like server is dropping the clients because either the server or
> >>>> both are too busy to deal with the load.
> >>>> This log line is also concerning: "Too busy to snap, skipping”
> >>>>
> >>>> If that’s the case I believe you'll have to profile the server process
> >> to
> >>>> figure out where the perf bottleneck is.
> >>>>
> >>>> Andor
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> >>>>>
> >>>>> Thanks, Patrick.
> >>>>>
> >>>>> Yes, we are using the same JVM version and GC configurations when
> >>>>> running the two tests. I have checked the GC metrics and also the
> >> heap
> >>>> dump
> >>>>> of the 3.6, the GC pause and the memory usage look okay.
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Li
> >>>>>
> >>>>> On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> >> wrote:
> >>>>>
> >>>>>> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Enrico, Sushant,
> >>>>>>>
> >>>>>>> I re-run the perf test with the data consistency check feature
> >>> disabled
> >>>>>>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> >> issue
> >>> of
> >>>>>> 3.6
> >>>>>>> is still there.
> >>>>>>>
> >>>>>>> With everything exactly the same, the throughput of 3.6 was only
> >> 1/2
> >>> of
> >>>>>> 3.4
> >>>>>>> and the max latency was more than 8 times.
> >>>>>>>
> >>>>>>> Any other points or thoughts?
> >>>>>>>
> >>>>>>>
> >>>>>> In the past I've noticed a big impact of GC when doing certain
> >>>> performance
> >>>>>> measurements. I assume you are using the same JVM version and GC
> >> when
> >>>>>> running the two tests? Perhaps our memory footprint has expanded
> >> over
> >>>> time.
> >>>>>> You should rule out GC by running with gc logging turned on with
> >> both
> >>>>>> versions and compare the impact.
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Patrick
> >>>>>>
> >>>>>>
> >>>>>>> Cheers,
> >>>>>>>
> >>>>>>> Li
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Thanks Sushant and Enrico!
> >>>>>>>>
> >>>>>>>> This is a really good point.  According to the 3.6 documentation,
> >>> the
> >>>>>>>> feature is disabled by default.
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> >>>>>>> .
> >>>>>>>> However, checking the code, the default is enabled.
> >>>>>>>>
> >>>>>>>> Let me set the zookeeper.digest.enabled to false and see how the
> >>> write
> >>>>>>>> operation performs.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>>
> >>>>>>>> Li
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> >>> sushantmane7@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Li,
> >>>>>>>>>
> >>>>>>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> >> default:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> >>>>>>>>> .
> >>>>>>>>> It is not present in ZK 3.4.14.
> >>>>>>>>>
> >>>>>>>>> This feature does have some impact on write performance.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Sushant
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> >>>> eolivelli@gmail.com
> >>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Li,
> >>>>>>>>>> I wonder of we have some new throttling/back pressure mechanisms
> >>>>>> that
> >>>>>>> is
> >>>>>>>>>> enabled by default.
> >>>>>>>>>>
> >>>>>>>>>> Does anyone has some pointer to relevant implementations?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Enrico
> >>>>>>>>>>
> >>>>>>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> >> scritto:
> >>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> We switched to Netty on both client side and server side and
> >> the
> >>>>>>>>>>> performance issue is still there.  Anyone has any insights on
> >>> what
> >>>>>>>>> could
> >>>>>>>>>> be
> >>>>>>>>>>> the cause of higher latency?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>>
> >>>>>>>>>>> Li
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Enrico,
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks for the reply.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2. Yes, here are some metrics on the client side.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> >> Max
> >>>>>>>>> Latency
> >>>>>>>>>>> 31s
> >>>>>>>>>>>>
> >>>>>>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> >>>>>>> Latency:
> >>>>>>>>>> 1.6s
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> >>>>>>>>>>>>
> >>>>>>>>>>>> 10G of Heap
> >>>>>>>>>>>>
> >>>>>>>>>>>> 13G of Memory
> >>>>>>>>>>>>
> >>>>>>>>>>>> 5 Participante
> >>>>>>>>>>>>
> >>>>>>>>>>>> 5 Observere
> >>>>>>>>>>>>
> >>>>>>>>>>>> Client session timeout: 3000ms
> >>>>>>>>>>>>
> >>>>>>>>>>>> Server min session time: 4000ms
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> >>>>>>>>> session”
> >>>>>>>>>>>> INFO log
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> >>>>>>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
> >>>>>>>>>>>>
> >>>>>>>>>>>> EndOfStreamException: Unable to read additional data from
> >>>>>> client,
> >>>>>>> it
> >>>>>>>>>>>> probably closed the socket: address = /100.108.63.116:43366,
> >>>>>>>>> session =
> >>>>>>>>>>>> 0x400189fee9a000b
> >>>>>>>>>>>>
> >>>>>>>>>>>> at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> >>>>>>>>>>>>
> >>>>>>>>>>>> at
> >>>>>>>>>>
> >>>>>>
> >> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> >>>>>>>>>>>>
> >>>>>>>>>>>> at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> >>>>>>>>>>>>
> >>>>>>>>>>>> at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> >>>>>>>>>>>>
> >>>>>>>>>>>> at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> >>>>>>>>>>>>
> >>>>>>>>>>>> at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> >>>>>>>>>>>>
> >>>>>>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> >>>>>>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> >>>>>>>>> skipping
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> >>>>>>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> >>>>>>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> >>>>>> Actually,
> >>>>>>>>> the
> >>>>>>>>>>>> issue happened with the combinations of
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 3.4 client and 3.6 server
> >>>>>>>>>>>>
> >>>>>>>>>>>> 3.6 client and 3.6 server
> >>>>>>>>>>>>
> >>>>>>>>>>>> Please let me know if you need any additional info.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Li
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Enrico,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the reply.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1. We are using direct NIO based stack, not Netty based yet.
> >>>>>>>>>>>>> 2. Yes, on the client side, here are the metrics
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 3.6:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> >>>>>>>>> eolivelli@gmail.com
> >>>>>>>>>>>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> IIRC The main difference is about the switch to Netty 4 and
> >>>>>>> about
> >>>>>>>>>> using
> >>>>>>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Apart from that macro difference there have been many many
> >>>>>>> changes
> >>>>>>>>>>> since
> >>>>>>>>>>>>>> 3.4.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Do you have some metrics to share?
> >>>>>>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration equals
> >>>>>> to
> >>>>>>>>> each
> >>>>>>>>>>>>>> other?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Do you see warnings on the server logs?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Did you upgrade both the client and the server or only the
> >>>>>>> server?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Enrico
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> >>>>>>> scritto:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> >>>>>>>>> perform/load
> >>>>>>>>>>>>>>> comparison test,  it was found that the performance of 3.6
> >>>>>> has
> >>>>>>>>> been
> >>>>>>>>>>>>>>> significantly degraded compared to 3.4 for the write
> >>>>>>> operation.
> >>>>>>>>>> Under
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> same load, there was a huge number of SessionExpired and
> >>>>>>>>>>> ConnectionLoss
> >>>>>>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The load testing is 500 concurrent users with a cluster of
> >> 5
> >>>>>>>>>>>>>> participants
> >>>>>>>>>>>>>>> and 5 observers. The min session timeout on the server side
> >>>>>> is
> >>>>>>>>>>> 4000ms.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> >>>>>>> insights
> >>>>>>>>> on
> >>>>>>>>>>> what
> >>>>>>>>>>>>>>> could be the cause of the performance degradation.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Li
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>
> >>
>
>

Re: write performance issue in 3.6.2

Posted by Andor Molnar <an...@apache.org>.
Hi folks,

As previously mentioned, the community won’t be able to help if you don’t share more information about your scenario. We need to see the following:

- which version of ZooKeeper is being used,
- how many nodes are you running in the ZK cluster,
- what is the server configuration? are any custom settings in place? (see the zoo.cfg sketch after this list)
- what is the hardware and software setup? on-prem or cloud? instance type? CPU, memory, disk properties, operating system, etc.
- network characteristics
- how many clients are connected and what are they doing? share the relevant source code of your client or the command that you’re running,
- 3.6 has advanced monitoring capabilities; set up Prometheus and share screenshots of relevant metrics,
- server and client logs, debug enabled if possible,
- security settings: TLS, Kerberos, etc.
- ...anything else which could be important
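
For example, here is a minimal zoo.cfg sketch of the kind of settings worth sharing (the keys are standard 3.6 options, but the values below are only placeholders, not recommendations):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
minSessionTimeout=4000
maxSessionTimeout=40000
# 3.6 Prometheus exporter; remove these two lines to fall back to the default metrics provider
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=7000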

In a nutshell, either you have to share information about your production system or provide a reproduction setup. Performance issues are pretty hard to resolve because there are so many moving parts. The community is willing to help, but you need to share this information to be successful.

shrikant,
ZK 3.6 has throttling for both client connections and requests. Request throttling can be disabled and it’s disabled by default, but connection throttling is not. From the log messages we can tell which throttling is in effect for your scenario.

Regards,
Andor



> On 2021. Apr 21., at 5:25, shrikant kalani <sh...@gmail.com> wrote:
> 
> Hello Everyone,
> 
> We are also using zookeeper 3.6.2 with ssl turned on both sides. We
> observed the same behaviour where under high write load the ZK server
> starts expiring the session. There are no jvm related issues. During high
> load the max latency increases significantly.
> 
> Also the session expiration message is not accurate. We do have session
> expiration set to 40 sec but ZK server disconnects the client within 10 sec.
> 
> Also the logs prints throttling the request but ZK documentation says
> throttling is disabled by default. Can someone check the code once to see
> if it is enabled or disabled. I am not a developer and hence not familiar
> with java code.
> 
> Thanks
> Srikant Kalani
> 
> On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@apache.org> wrote:
> 
>> What is the workload looking like? Is it pure write, or mixed read write?
>> 
>> A couple of ideas to move this forward:
>> * Publish the performance benchmark so the community can help.
>> * Bisect git commit and find the bad commit that caused the regression.
>> * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
>> metrics) to measure where time spends during writes. We might have to add
>> these metrics on 3.4 to get a fair comparison.
>> 
>> For the throttling - the RequestThrottler introduced in 3.6 does introduce
>> latency, but should not impact throughput this much.
>> 
>> On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
>> 
>>> The CPU usage of both server and client are normal (< 50%) during the
>> test.
>>> 
>>> Based on the investigation, the server is too busy with the load.
>>> 
>>> The issue doesn't exist in 3.4.14. I wonder why there is a significant
>>> write performance degradation from 3.4.14 to 3.6.2 and how we can address
>>> the issue.
>>> 
>>> Best,
>>> 
>>> Li
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org> wrote:
>>> 
>>>> What is the CPU usage of both server and client during the test?
>>>> 
>>>> Looks like server is dropping the clients because either the server or
>>>> both are too busy to deal with the load.
>>>> This log line is also concerning: "Too busy to snap, skipping”
>>>> 
>>>> If that’s the case I believe you'll have to profile the server process
>> to
>>>> figure out where the perf bottleneck is.
>>>> 
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
>>>>> 
>>>>> Thanks, Patrick.
>>>>> 
>>>>> Yes, we are using the same JVM version and GC configurations when
>>>>> running the two tests. I have checked the GC metrics and also the
>> heap
>>>> dump
>>>>> of the 3.6, the GC pause and the memory usage look okay.
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Li
>>>>> 
>>>>> On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
>> wrote:
>>>>> 
>>>>>> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi Enrico, Sushant,
>>>>>>> 
>>>>>>> I re-run the perf test with the data consistency check feature
>>> disabled
>>>>>>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
>> issue
>>> of
>>>>>> 3.6
>>>>>>> is still there.
>>>>>>> 
>>>>>>> With everything exactly the same, the throughput of 3.6 was only
>> 1/2
>>> of
>>>>>> 3.4
>>>>>>> and the max latency was more than 8 times.
>>>>>>> 
>>>>>>> Any other points or thoughts?
>>>>>>> 
>>>>>>> 
>>>>>> In the past I've noticed a big impact of GC when doing certain
>>>> performance
>>>>>> measurements. I assume you are using the same JVM version and GC
>> when
>>>>>> running the two tests? Perhaps our memory footprint has expanded
>> over
>>>> time.
>>>>>> You should rule out GC by running with gc logging turned on with
>> both
>>>>>> versions and compare the impact.
>>>>>> 
>>>>>> Regards,
>>>>>> 
>>>>>> Patrick
>>>>>> 
>>>>>> 
>>>>>>> Cheers,
>>>>>>> 
>>>>>>> Li
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Thanks Sushant and Enrico!
>>>>>>>> 
>>>>>>>> This is a really good point.  According to the 3.6 documentation,
>>> the
>>>>>>>> feature is disabled by default.
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
>>>>>>> .
>>>>>>>> However, checking the code, the default is enabled.
>>>>>>>> 
>>>>>>>> Let me set the zookeeper.digest.enabled to false and see how the
>>> write
>>>>>>>> operation performs.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> 
>>>>>>>> Li
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
>>> sushantmane7@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Li,
>>>>>>>>> 
>>>>>>>>> On 3.6.2 consistency checker (adhash based) is enabled by
>> default:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
>>>>>>>>> .
>>>>>>>>> It is not present in ZK 3.4.14.
>>>>>>>>> 
>>>>>>>>> This feature does have some impact on write performance.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Sushant
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
>>>> eolivelli@gmail.com
>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Li,
>>>>>>>>>> I wonder of we have some new throttling/back pressure mechanisms
>>>>>> that
>>>>>>> is
>>>>>>>>>> enabled by default.
>>>>>>>>>> 
>>>>>>>>>> Does anyone has some pointer to relevant implementations?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Enrico
>>>>>>>>>> 
>>>>>>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
>> scritto:
>>>>>>>>>> 
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> We switched to Netty on both client side and server side and
>> the
>>>>>>>>>>> performance issue is still there.  Anyone has any insights on
>>> what
>>>>>>>>> could
>>>>>>>>>> be
>>>>>>>>>>> the cause of higher latency?
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> 
>>>>>>>>>>> Li
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi Enrico,
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. We are using NIO based stack, not Netty based yet.
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. Yes, here are some metrics on the client side.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
>> Max
>>>>>>>>> Latency
>>>>>>>>>>> 31s
>>>>>>>>>>>> 
>>>>>>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
>>>>>>> Latency:
>>>>>>>>>> 1.6s
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
>>>>>>>>>>>> 
>>>>>>>>>>>> 10G of Heap
>>>>>>>>>>>> 
>>>>>>>>>>>> 13G of Memory
>>>>>>>>>>>> 
>>>>>>>>>>>> 5 Participante
>>>>>>>>>>>> 
>>>>>>>>>>>> 5 Observere
>>>>>>>>>>>> 
>>>>>>>>>>>> Client session timeout: 3000ms
>>>>>>>>>>>> 
>>>>>>>>>>>> Server min session time: 4000ms
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
>>>>>>>>> session”
>>>>>>>>>>>> INFO log
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
>>>>>>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
>>>>>>>>>>>> 
>>>>>>>>>>>> EndOfStreamException: Unable to read additional data from
>>>>>> client,
>>>>>>> it
>>>>>>>>>>>> probably closed the socket: address = /100.108.63.116:43366,
>>>>>>>>> session =
>>>>>>>>>>>> 0x400189fee9a000b
>>>>>>>>>>>> 
>>>>>>>>>>>> at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
>>>>>>>>>>>> 
>>>>>>>>>>>> at
>>>>>>>>>> 
>>>>>> 
>> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
>>>>>>>>>>>> 
>>>>>>>>>>>> at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
>>>>>>>>>>>> 
>>>>>>>>>>>> at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
>>>>>>>>>>>> 
>>>>>>>>>>>> at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>>>>>>>>>> 
>>>>>>>>>>>> at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>>>>>>>>>> 
>>>>>>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
>>>>>>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
>>>>>>>>> skipping
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
>>>>>>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
>>>>>>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
>>>>>> Actually,
>>>>>>>>> the
>>>>>>>>>>>> issue happened with the combinations of
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 3.4 client and 3.6 server
>>>>>>>>>>>> 
>>>>>>>>>>>> 3.6 client and 3.6 server
>>>>>>>>>>>> 
>>>>>>>>>>>> Please let me know if you need any additional info.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> 
>>>>>>>>>>>> Li
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Enrico,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 1. We are using direct NIO based stack, not Netty based yet.
>>>>>>>>>>>>> 2. Yes, on the client side, here are the metrics
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 3.6:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
>>>>>>>>> eolivelli@gmail.com
>>>>>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> IIRC The main difference is about the switch to Netty 4 and
>>>>>>> about
>>>>>>>>>> using
>>>>>>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Apart from that macro difference there have been many many
>>>>>>> changes
>>>>>>>>>>> since
>>>>>>>>>>>>>> 3.4.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Do you have some metrics to share?
>>>>>>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration equals
>>>>>> to
>>>>>>>>> each
>>>>>>>>>>>>>> other?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Do you see warnings on the server logs?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Did you upgrade both the client and the server or only the
>>>>>>> server?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Enrico
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
>>>>>>> scritto:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
>>>>>>>>> perform/load
>>>>>>>>>>>>>>> comparison test,  it was found that the performance of 3.6
>>>>>> has
>>>>>>>>> been
>>>>>>>>>>>>>>> significantly degraded compared to 3.4 for the write
>>>>>>> operation.
>>>>>>>>>> Under
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> same load, there was a huge number of SessionExpired and
>>>>>>>>>>> ConnectionLoss
>>>>>>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The load testing is 500 concurrent users with a cluster of
>> 5
>>>>>>>>>>>>>> participants
>>>>>>>>>>>>>>> and 5 observers. The min session timeout on the server side
>>>>>> is
>>>>>>>>>>> 4000ms.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I wonder if anyone has seen the same issue and has any
>>>>>>> insights
>>>>>>>>> on
>>>>>>>>>>> what
>>>>>>>>>>>>>>> could be the cause of the performance degradation.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Li
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>> 


Re: write performance issue in 3.6.2

Posted by Li Wang <li...@gmail.com>.
Hi Srikant,

1. Have you tried to run the test without enabling Prometheus metrics? What
I observed is that enabling Prometheus has a significant performance impact
(about 40%-60% degradation).
2. In addition to the session expiry errors and the increase in max latency,
did you see any issue with throughput?
3. Request throttling is disabled by default; however, all requests still go
through the RequestThrottler. Here is the code (see also the quick check
after this list).

private static volatile int maxRequests =
Integer.getInteger("zookeeper.request_throttle_max_requests", 0);

4. What are the request throttling logs you've seen?
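
If it helps, here is a quick standalone check (ThrottleCheck is just a
hypothetical name, not part of ZooKeeper) that does the same
Integer.getInteger lookup, so you can see how a given -D setting resolves:

public class ThrottleCheck {
    public static void main(String[] args) {
        // Same lookup as in RequestThrottler; 0 (the default) means requests are not throttled.
        int maxRequests = Integer.getInteger("zookeeper.request_throttle_max_requests", 0);
        System.out.println(maxRequests == 0
            ? "request throttling is off (unlimited)"
            : "request throttling limit = " + maxRequests);
    }
}

On the server side the property can be pinned via
SERVER_JVMFLAGS="-Dzookeeper.request_throttle_max_requests=..." (assuming
the stock zkServer.sh/zkEnv.sh scripts are used).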

Best,

Li





On Tue, Apr 20, 2021 at 9:06 PM shrikant kalani <sh...@gmail.com>
wrote:

> Hello Everyone,
>
> We are also using zookeeper 3.6.2 with ssl turned on both sides. We
> observed the same behaviour where under high write load the ZK server
> starts expiring the session. There are no jvm related issues. During high
> load the max latency increases significantly.
>
> Also the session expiration message is not accurate. We do have session
> expiration set to 40 sec but ZK server disconnects the client within 10
> sec.
>
> Also the logs prints throttling the request but ZK documentation says
> throttling is disabled by default. Can someone check the code once to see
> if it is enabled or disabled. I am not a developer and hence not familiar
> with java code.
>
> Thanks
> Srikant Kalani
>
> On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@apache.org> wrote:
>
> > What is the workload looking like? Is it pure write, or mixed read write?
> >
> > A couple of ideas to move this forward:
> > * Publish the performance benchmark so the community can help.
> > * Bisect git commit and find the bad commit that caused the regression.
> > * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> > metrics) to measure where time spends during writes. We might have to add
> > these metrics on 3.4 to get a fair comparison.
> >
> > For the throttling - the RequestThrottler introduced in 3.6 does
> introduce
> > latency, but should not impact throughput this much.
> >
> > On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
> >
> > > The CPU usage of both server and client are normal (< 50%) during the
> > test.
> > >
> > > Based on the investigation, the server is too busy with the load.
> > >
> > > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > > write performance degradation from 3.4.14 to 3.6.2 and how we can
> address
> > > the issue.
> > >
> > > Best,
> > >
> > > Li
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org>
> wrote:
> > >
> > > > What is the CPU usage of both server and client during the test?
> > > >
> > > > Looks like server is dropping the clients because either the server
> or
> > > > both are too busy to deal with the load.
> > > > This log line is also concerning: "Too busy to snap, skipping”
> > > >
> > > > If that’s the case I believe you'll have to profile the server
> process
> > to
> > > > figure out where the perf bottleneck is.
> > > >
> > > > Andor
> > > >
> > > >
> > > >
> > > >
> > > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > > >
> > > > > Thanks, Patrick.
> > > > >
> > > > > Yes, we are using the same JVM version and GC configurations when
> > > > > running the two tests. I have checked the GC metrics and also the
> > heap
> > > > dump
> > > > > of the 3.6, the GC pause and the memory usage look okay.
> > > > >
> > > > > Best,
> > > > >
> > > > > Li
> > > > >
> > > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> > wrote:
> > > > >
> > > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com>
> wrote:
> > > > >>
> > > > >>> Hi Enrico, Sushant,
> > > > >>>
> > > > >>> I re-run the perf test with the data consistency check feature
> > > disabled
> > > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> > issue
> > > of
> > > > >> 3.6
> > > > >>> is still there.
> > > > >>>
> > > > >>> With everything exactly the same, the throughput of 3.6 was only
> > 1/2
> > > of
> > > > >> 3.4
> > > > >>> and the max latency was more than 8 times.
> > > > >>>
> > > > >>> Any other points or thoughts?
> > > > >>>
> > > > >>>
> > > > >> In the past I've noticed a big impact of GC when doing certain
> > > > performance
> > > > >> measurements. I assume you are using the same JVM version and GC
> > when
> > > > >> running the two tests? Perhaps our memory footprint has expanded
> > over
> > > > time.
> > > > >> You should rule out GC by running with gc logging turned on with
> > both
> > > > >> versions and compare the impact.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Patrick
> > > > >>
> > > > >>
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Li
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com>
> wrote:
> > > > >>>
> > > > >>>> Thanks Sushant and Enrico!
> > > > >>>>
> > > > >>>> This is a really good point.  According to the 3.6
> documentation,
> > > the
> > > > >>>> feature is disabled by default.
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > > >>> .
> > > > >>>> However, checking the code, the default is enabled.
> > > > >>>>
> > > > >>>> Let me set the zookeeper.digest.enabled to false and see how the
> > > write
> > > > >>>> operation performs.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>>
> > > > >>>> Li
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > > sushantmane7@gmail.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Hi Li,
> > > > >>>>>
> > > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> > default:
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > > >>>>> .
> > > > >>>>> It is not present in ZK 3.4.14.
> > > > >>>>>
> > > > >>>>> This feature does have some impact on write performance.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Sushant
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > > eolivelli@gmail.com
> > > > >>>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Li,
> > > > >>>>>> I wonder of we have some new throttling/back pressure
> mechanisms
> > > > >> that
> > > > >>> is
> > > > >>>>>> enabled by default.
> > > > >>>>>>
> > > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Enrico
> > > > >>>>>>
> > > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> > scritto:
> > > > >>>>>>
> > > > >>>>>>> Hi,
> > > > >>>>>>>
> > > > >>>>>>> We switched to Netty on both client side and server side and
> > the
> > > > >>>>>>> performance issue is still there.  Anyone has any insights on
> > > what
> > > > >>>>> could
> > > > >>>>>> be
> > > > >>>>>>> the cause of higher latency?
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>>
> > > > >>>>>>> Li
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> > > > >> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Enrico,
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks for the reply.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> > Max
> > > > >>>>> Latency
> > > > >>>>>>> 31s
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > > > >>> Latency:
> > > > >>>>>> 1.6s
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > > >>>>>>>>
> > > > >>>>>>>> 10G of Heap
> > > > >>>>>>>>
> > > > >>>>>>>> 13G of Memory
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Participante
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Observere
> > > > >>>>>>>>
> > > > >>>>>>>> Client session timeout: 3000ms
> > > > >>>>>>>>
> > > > >>>>>>>> Server min session time: 4000ms
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> > > > >>>>> session”
> > > > >>>>>>>> INFO log
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected
> exception
> > > > >>>>>>>>
> > > > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > > > >> client,
> > > > >>> it
> > > > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366
> ,
> > > > >>>>> session =
> > > > >>>>>>>> 0x400189fee9a000b
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>
> > > > >>
> > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > > >>>>>>>>
> > > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > > > >>>>> skipping
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > > >> Actually,
> > > > >>>>> the
> > > > >>>>>>>> issue happened with the combinations of
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> Please let me know if you need any additional info.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>>
> > > > >>>>>>>> Li
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > > > >>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi Enrico,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for the reply.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based
> yet.
> > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > >>>>>>>>>
> > > > >>>>>>>>> 3.6:
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > > >>>>> eolivelli@gmail.com
> > > > >>>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4
> and
> > > > >>> about
> > > > >>>>>> using
> > > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Apart from that macro difference there have been many many
> > > > >>> changes
> > > > >>>>>>> since
> > > > >>>>>>>>>> 3.4.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you have some metrics to share?
> > > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration
> equals
> > > > >> to
> > > > >>>>> each
> > > > >>>>>>>>>> other?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > > > >>> server?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Enrico
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > > > >>> scritto:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > > >>>>> perform/load
> > > > >>>>>>>>>>> comparison test,  it was found that the performance of
> 3.6
> > > > >> has
> > > > >>>>> been
> > > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > > >>> operation.
> > > > >>>>>> Under
> > > > >>>>>>>>>> the
> > > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > > > >>>>>>> ConnectionLoss
> > > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster
> of
> > 5
> > > > >>>>>>>>>> participants
> > > > >>>>>>>>>>> and 5 observers. The min session timeout on the server
> side
> > > > >> is
> > > > >>>>>>> 4000ms.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > > >>> insights
> > > > >>>>> on
> > > > >>>>>>> what
> > > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Li
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by shrikant kalani <sh...@gmail.com>.
Hello Everyone,

We are also using ZooKeeper 3.6.2 with SSL enabled on both sides. We
observed the same behaviour: under high write load the ZK server starts
expiring sessions. There are no JVM-related issues. During high load the
max latency increases significantly.

Also, the session expiration message is not accurate. We have the session
timeout set to 40 sec, but the ZK server disconnects the client within 10 sec.
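
Could this be related to the server-side session timeout bounds? If I
understand the docs correctly, the server clamps whatever timeout the client
asks for into the minSessionTimeout/maxSessionTimeout window from zoo.cfg,
which default to 2*tickTime and 20*tickTime, for example:

tickTime=2000
minSessionTimeout=4000
maxSessionTimeout=40000

(The values above are just the documented defaults for tickTime=2000, not
our actual settings.)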

Also, the logs print messages about throttling requests, but the ZK
documentation says throttling is disabled by default. Can someone check the
code to see whether it is enabled or disabled? I am not a developer and
hence not familiar with Java code.

Thanks
Srikant Kalani

On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@apache.org> wrote:

> What is the workload looking like? Is it pure write, or mixed read write?
>
> A couple of ideas to move this forward:
> * Publish the performance benchmark so the community can help.
> * Bisect git commit and find the bad commit that caused the regression.
> * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> metrics) to measure where time spends during writes. We might have to add
> these metrics on 3.4 to get a fair comparison.
>
> For the throttling - the RequestThrottler introduced in 3.6 does introduce
> latency, but should not impact throughput this much.
>
> On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
>
> > The CPU usage of both server and client are normal (< 50%) during the
> test.
> >
> > Based on the investigation, the server is too busy with the load.
> >
> > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > write performance degradation from 3.4.14 to 3.6.2 and how we can address
> > the issue.
> >
> > Best,
> >
> > Li
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org> wrote:
> >
> > > What is the CPU usage of both server and client during the test?
> > >
> > > Looks like server is dropping the clients because either the server or
> > > both are too busy to deal with the load.
> > > This log line is also concerning: "Too busy to snap, skipping”
> > >
> > > If that’s the case I believe you'll have to profile the server process
> to
> > > figure out where the perf bottleneck is.
> > >
> > > Andor
> > >
> > >
> > >
> > >
> > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > >
> > > > Thanks, Patrick.
> > > >
> > > > Yes, we are using the same JVM version and GC configurations when
> > > > running the two tests. I have checked the GC metrics and also the
> heap
> > > dump
> > > > of the 3.6, the GC pause and the memory usage look okay.
> > > >
> > > > Best,
> > > >
> > > > Li
> > > >
> > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> wrote:
> > > >
> > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com> wrote:
> > > >>
> > > >>> Hi Enrico, Sushant,
> > > >>>
> > > >>> I re-run the perf test with the data consistency check feature
> > disabled
> > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> issue
> > of
> > > >> 3.6
> > > >>> is still there.
> > > >>>
> > > >>> With everything exactly the same, the throughput of 3.6 was only
> 1/2
> > of
> > > >> 3.4
> > > >>> and the max latency was more than 8 times.
> > > >>>
> > > >>> Any other points or thoughts?
> > > >>>
> > > >>>
> > > >> In the past I've noticed a big impact of GC when doing certain
> > > performance
> > > >> measurements. I assume you are using the same JVM version and GC
> when
> > > >> running the two tests? Perhaps our memory footprint has expanded
> over
> > > time.
> > > >> You should rule out GC by running with gc logging turned on with
> both
> > > >> versions and compare the impact.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Patrick
> > > >>
> > > >>
> > > >>> Cheers,
> > > >>>
> > > >>> Li
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com> wrote:
> > > >>>
> > > >>>> Thanks Sushant and Enrico!
> > > >>>>
> > > >>>> This is a really good point.  According to the 3.6 documentation,
> > the
> > > >>>> feature is disabled by default.
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > >>> .
> > > >>>> However, checking the code, the default is enabled.
> > > >>>>
> > > >>>> Let me set the zookeeper.digest.enabled to false and see how the
> > write
> > > >>>> operation performs.
> > > >>>>
> > > >>>> Best,
> > > >>>>
> > > >>>> Li
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > sushantmane7@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi Li,
> > > >>>>>
> > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> default:
> > > >>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > >>>>> .
> > > >>>>> It is not present in ZK 3.4.14.
> > > >>>>>
> > > >>>>> This feature does have some impact on write performance.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Sushant
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > eolivelli@gmail.com
> > > >>>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Li,
> > > >>>>>> I wonder of we have some new throttling/back pressure mechanisms
> > > >> that
> > > >>> is
> > > >>>>>> enabled by default.
> > > >>>>>>
> > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Enrico
> > > >>>>>>
> > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> scritto:
> > > >>>>>>
> > > >>>>>>> Hi,
> > > >>>>>>>
> > > >>>>>>> We switched to Netty on both client side and server side and
> the
> > > >>>>>>> performance issue is still there.  Anyone has any insights on
> > what
> > > >>>>> could
> > > >>>>>> be
> > > >>>>>>> the cause of higher latency?
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>>
> > > >>>>>>> Li
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> > > >> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Enrico,
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> Thanks for the reply.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > >>>>>>>>
> > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> Max
> > > >>>>> Latency
> > > >>>>>>> 31s
> > > >>>>>>>>
> > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > > >>> Latency:
> > > >>>>>> 1.6s
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > >>>>>>>>
> > > >>>>>>>> 10G of Heap
> > > >>>>>>>>
> > > >>>>>>>> 13G of Memory
> > > >>>>>>>>
> > > >>>>>>>> 5 Participante
> > > >>>>>>>>
> > > >>>>>>>> 5 Observere
> > > >>>>>>>>
> > > >>>>>>>> Client session timeout: 3000ms
> > > >>>>>>>>
> > > >>>>>>>> Server min session time: 4000ms
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> > > >>>>> session”
> > > >>>>>>>> INFO log
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
> > > >>>>>>>>
> > > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > > >> client,
> > > >>> it
> > > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366,
> > > >>>>> session =
> > > >>>>>>>> 0x400189fee9a000b
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>
> > > >>
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > >>>>>>>>
> > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > > >>>>> skipping
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > >> Actually,
> > > >>>>> the
> > > >>>>>>>> issue happened with the combinations of
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3.4 client and 3.6 server
> > > >>>>>>>>
> > > >>>>>>>> 3.6 client and 3.6 server
> > > >>>>>>>>
> > > >>>>>>>> Please let me know if you need any additional info.
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>>
> > > >>>>>>>> Li
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > > >>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi Enrico,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for the reply.
> > > >>>>>>>>>
> > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based yet.
> > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > >>>>>>>>>
> > > >>>>>>>>> 3.6:
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > >>>>> eolivelli@gmail.com
> > > >>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4 and
> > > >>> about
> > > >>>>>> using
> > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Apart from that macro difference there have been many many
> > > >>> changes
> > > >>>>>>> since
> > > >>>>>>>>>> 3.4.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Do you have some metrics to share?
> > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration equals
> > > >> to
> > > >>>>> each
> > > >>>>>>>>>> other?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Do you see warnings on the server logs?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > > >>> server?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Enrico
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > > >>> scritto:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > >>>>> perform/load
> > > >>>>>>>>>>> comparison test,  it was found that the performance of 3.6
> > > >> has
> > > >>>>> been
> > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > >>> operation.
> > > >>>>>> Under
> > > >>>>>>>>>> the
> > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > > >>>>>>> ConnectionLoss
> > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster of
> 5
> > > >>>>>>>>>> participants
> > > >>>>>>>>>>> and 5 observers. The min session timeout on the server side
> > > >> is
> > > >>>>>>> 4000ms.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > >>> insights
> > > >>>>> on
> > > >>>>>>> what
> > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Li
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by shrikant kalani <sh...@gmail.com>.
Hello Everyone,

We are also using ZooKeeper 3.6.2, with SSL enabled on both sides. We
observed the same behaviour: under high write load the ZK server starts
expiring sessions. There are no JVM-related issues. During high load the
max latency increases significantly.

Also, the session expiration message is not accurate. We have the session
timeout set to 40 sec, but the ZK server disconnects the client within 10 sec.

Also, the logs print messages about throttling requests, but the ZK
documentation says throttling is disabled by default. Can someone check the
code to confirm whether it is enabled or disabled? I am not a developer and
hence not familiar with the Java code.

Thanks
Srikant Kalani

On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@apache.org> wrote:

> What is the workload looking like? Is it pure write, or mixed read write?
>
> A couple of ideas to move this forward:
> * Publish the performance benchmark so the community can help.
> * Bisect git commit and find the bad commit that caused the regression.
> * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> metrics) to measure where time spends during writes. We might have to add
> these metrics on 3.4 to get a fair comparison.
>
> For the throttling - the RequestThrottler introduced in 3.6 does introduce
> latency, but should not impact throughput this much.
>
> On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
>
> > The CPU usage of both server and client are normal (< 50%) during the
> test.
> >
> > Based on the investigation, the server is too busy with the load.
> >
> > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > write performance degradation from 3.4.14 to 3.6.2 and how we can address
> > the issue.
> >
> > Best,
> >
> > Li
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org> wrote:
> >
> > > What is the CPU usage of both server and client during the test?
> > >
> > > Looks like server is dropping the clients because either the server or
> > > both are too busy to deal with the load.
> > > This log line is also concerning: "Too busy to snap, skipping”
> > >
> > > If that’s the case I believe you'll have to profile the server process
> to
> > > figure out where the perf bottleneck is.
> > >
> > > Andor
> > >
> > >
> > >
> > >
> > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > >
> > > > Thanks, Patrick.
> > > >
> > > > Yes, we are using the same JVM version and GC configurations when
> > > > running the two tests. I have checked the GC metrics and also the
> heap
> > > dump
> > > > of the 3.6, the GC pause and the memory usage look okay.
> > > >
> > > > Best,
> > > >
> > > > Li
> > > >
> > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> wrote:
> > > >
> > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com> wrote:
> > > >>
> > > >>> Hi Enrico, Sushant,
> > > >>>
> > > >>> I re-run the perf test with the data consistency check feature
> > disabled
> > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> issue
> > of
> > > >> 3.6
> > > >>> is still there.
> > > >>>
> > > >>> With everything exactly the same, the throughput of 3.6 was only
> 1/2
> > of
> > > >> 3.4
> > > >>> and the max latency was more than 8 times.
> > > >>>
> > > >>> Any other points or thoughts?
> > > >>>
> > > >>>
> > > >> In the past I've noticed a big impact of GC when doing certain
> > > performance
> > > >> measurements. I assume you are using the same JVM version and GC
> when
> > > >> running the two tests? Perhaps our memory footprint has expanded
> over
> > > time.
> > > >> You should rule out GC by running with gc logging turned on with
> both
> > > >> versions and compare the impact.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Patrick
> > > >>
> > > >>
> > > >>> Cheers,
> > > >>>
> > > >>> Li
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com> wrote:
> > > >>>
> > > >>>> Thanks Sushant and Enrico!
> > > >>>>
> > > >>>> This is a really good point.  According to the 3.6 documentation,
> > the
> > > >>>> feature is disabled by default.
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > >>> .
> > > >>>> However, checking the code, the default is enabled.
> > > >>>>
> > > >>>> Let me set the zookeeper.digest.enabled to false and see how the
> > write
> > > >>>> operation performs.
> > > >>>>
> > > >>>> Best,
> > > >>>>
> > > >>>> Li
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > sushantmane7@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi Li,
> > > >>>>>
> > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> default:
> > > >>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > >>>>> .
> > > >>>>> It is not present in ZK 3.4.14.
> > > >>>>>
> > > >>>>> This feature does have some impact on write performance.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Sushant
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > eolivelli@gmail.com
> > > >>>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Li,
> > > >>>>>> I wonder of we have some new throttling/back pressure mechanisms
> > > >> that
> > > >>> is
> > > >>>>>> enabled by default.
> > > >>>>>>
> > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Enrico
> > > >>>>>>
> > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> scritto:
> > > >>>>>>
> > > >>>>>>> Hi,
> > > >>>>>>>
> > > >>>>>>> We switched to Netty on both client side and server side and
> the
> > > >>>>>>> performance issue is still there.  Anyone has any insights on
> > what
> > > >>>>> could
> > > >>>>>> be
> > > >>>>>>> the cause of higher latency?
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>>
> > > >>>>>>> Li
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> > > >> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Enrico,
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> Thanks for the reply.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > >>>>>>>>
> > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> Max
> > > >>>>> Latency
> > > >>>>>>> 31s
> > > >>>>>>>>
> > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > > >>> Latency:
> > > >>>>>> 1.6s
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > >>>>>>>>
> > > >>>>>>>> 10G of Heap
> > > >>>>>>>>
> > > >>>>>>>> 13G of Memory
> > > >>>>>>>>
> > > >>>>>>>> 5 Participante
> > > >>>>>>>>
> > > >>>>>>>> 5 Observere
> > > >>>>>>>>
> > > >>>>>>>> Client session timeout: 3000ms
> > > >>>>>>>>
> > > >>>>>>>> Server min session time: 4000ms
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> > > >>>>> session”
> > > >>>>>>>> INFO log
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
> > > >>>>>>>>
> > > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > > >> client,
> > > >>> it
> > > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366,
> > > >>>>> session =
> > > >>>>>>>> 0x400189fee9a000b
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>
> > > >>
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > >>>>>>>>
> > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > > >>>>> skipping
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > >> Actually,
> > > >>>>> the
> > > >>>>>>>> issue happened with the combinations of
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3.4 client and 3.6 server
> > > >>>>>>>>
> > > >>>>>>>> 3.6 client and 3.6 server
> > > >>>>>>>>
> > > >>>>>>>> Please let me know if you need any additional info.
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>>
> > > >>>>>>>> Li
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > > >>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi Enrico,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for the reply.
> > > >>>>>>>>>
> > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based yet.
> > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > >>>>>>>>>
> > > >>>>>>>>> 3.6:
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > >>>>> eolivelli@gmail.com
> > > >>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4 and
> > > >>> about
> > > >>>>>> using
> > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Apart from that macro difference there have been many many
> > > >>> changes
> > > >>>>>>> since
> > > >>>>>>>>>> 3.4.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Do you have some metrics to share?
> > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration equals
> > > >> to
> > > >>>>> each
> > > >>>>>>>>>> other?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Do you see warnings on the server logs?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > > >>> server?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Enrico
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > > >>> scritto:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > >>>>> perform/load
> > > >>>>>>>>>>> comparison test,  it was found that the performance of 3.6
> > > >> has
> > > >>>>> been
> > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > >>> operation.
> > > >>>>>> Under
> > > >>>>>>>>>> the
> > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > > >>>>>>> ConnectionLoss
> > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster of
> 5
> > > >>>>>>>>>> participants
> > > >>>>>>>>>>> and 5 observers. The min session timeout on the server side
> > > >> is
> > > >>>>>>> 4000ms.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > >>> insights
> > > >>>>> on
> > > >>>>>>> what
> > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Li
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by Michael Han <ha...@apache.org>.
>> because the tests were run with Prometheus enabled,  which is new in 3.6
and has significant negative perf impact.

Interesting, let's see what the numbers are without Prometheus involved. It
could be that the increased latency we observed in CommitProcessor is just
a symptom rather than the cause if Prometheus is the culprit.

In my production environment we use an in-house, in-process metrics library
which hasn't caused any performance trouble so far.

>> Does it measure how long a local write op takes in the commit processor
phase?

Yes.

>> I don't think the writes can be processed concurrently.

Correct. There can only be a single in-flight write operation committing at
any given time. The concurrency is between a single write and multiple reads
that belong to different sessions. Previously, in 3.4, a single write would
block the processing of every other operation, both writes and reads; in
3.6, a single write only blocks reads from the same session, and reads from
other sessions can proceed concurrently.
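
To make that concrete, here is a loose sketch of the ordering rule in plain
Java. This is only an illustration of the behavior described above, not the
actual CommitProcessor code, and the class/method names are made up:

    import java.util.ArrayDeque;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Queue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Reads run concurrently unless their session has a write that is still
    // committing; writes themselves are handled one at a time elsewhere.
    class PerSessionOrderingSketch {
        private final Map<Long, Queue<Runnable>> pendingBySession = new HashMap<>();
        private final ExecutorService readWorkers = Executors.newFixedThreadPool(4);

        synchronized void processRead(long sessionId, Runnable read) {
            Queue<Runnable> blocked = pendingBySession.get(sessionId);
            if (blocked != null) {
                blocked.add(read);         // a write from this session is in flight
            } else {
                readWorkers.execute(read); // no pending write: read proceeds now
            }
        }

        synchronized void beginWrite(long sessionId) {
            pendingBySession.putIfAbsent(sessionId, new ArrayDeque<>());
        }

        synchronized void endWrite(long sessionId) {
            Queue<Runnable> blocked = pendingBySession.remove(sessionId);
            if (blocked != null) {
                blocked.forEach(readWorkers::execute); // release the queued reads
            }
        }
    }

In 3.4 the equivalent picture is a single global queue, so the write at the
head blocks every read behind it regardless of which session it came from.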

>> Is it correct to say that the new CommitProcessor works best for the
reads in the read/write workloads scenario,

Yes, as previously mentioned, a single write now only conditionally blocks
other reads, as opposed to unconditionally blocking all reads, which
increases overall throughput.


On Mon, May 3, 2021 at 5:06 PM Li Wang <li...@gmail.com> wrote:

> Hi Michael,
>
> Thanks for your additional inputs.
>
> On Mon, May 3, 2021 at 3:13 PM Michael Han <ha...@apache.org> wrote:
>
> > Hi Li,
> >
> > Thanks for following up.
> >
> > >> write_commitproc_time_ms were large
> >
> > This measures how long a local write op hears back from the leader. If
> it's
> > big, then either the leader is very busy acking the request, or your
> > network RTT is high.
> >
>
> Does it measure how long a local write op takes in the commit processor
> phase?
>
> ServerMetrics.getMetrics().WRITE_COMMITPROC_TIME.add(currentTime -
> request.commitProcQueueStartTime);
>
>
> > How does the local fsync time (fsynctime) look like between two tests?
> >
>
> The fsync time looks similar between two tests.
>
> >
> > >> We've found that increasing the maxCommitBatchSize
> >
> > Are you able to successfully tune this value so your benchmark against
> 3.6
> > is on par with 3.4 now? The original report mentioned lots of session
> > timeout and con loss. I am wondering if we can fix this first by tuning
> > this size.
> >
>
> We ran load tests with maxCommitBatchSize as 500 vs 1. 500 was used as we
> have 500 concurrent users in the load test.
> The connection loss error count was reduced about 40% and the session
> expired error was reduced about 45%.
> Tuning maxCommitBatchSize can significantly reduce the errors. We can not
> say the benchmark is on par with 3.4 (i.e. no errors)
> because the tests were run with Prometheus enabled,  which is new in 3.6
> and has significant negative perf impact.
> We will run tests with Prometheus disabled and maxCommitBatchSize as 500
> when we get a chance.
>
> >
> > The major difference of CommitProcessor between 3.4.14 and 3.6.2 is the
> > newly added per session commit queue such that reads and a write from
> > different sessions can be processed concurrently.
>
> Yes, I noticed that there is a queue per session in 3.6, but I don't think
> the writes can be processed concurrently.
> The CommitProcessor is single threaded and the CommitProc worker threads
> are only for reads. Did I miss anything?
>
> This works best for mixed
> > read / write workloads, but for pure write workloads, the new
> > CommitProcessor is not superior, as all writes still have to be
> serialized
> > due to global ordering, plus the per session queue has the overhead for
> > example in this test ZK has to manage 500 queues (and enqueue / dequeue
> and
> > so on cost cycles). Though, I didn't expect this overhead can create
> such a
> > big difference in your test..
> >
>
> Is it correct to say that the new CommitProcessor works best for the reads
> in the read/write workloads scenario,
> as only the reads can be processed concurrently?
>
>
> > Also this is obvious but just want to confirm if the benchmark for two
> > versions of ZK was done on exact same test environment including same OS
> > and networking configuration?
> >
>
> Yes, the benchmark for the two versions was done on the same test
> environment and configuration.
>
> >
> > On Mon, Apr 26, 2021 at 7:35 PM Li Wang <li...@gmail.com> wrote:
> >
> > > Hi Michael,
> > >
> > > Thanks for your reply.
> > >
> > > 1. The workload is 500 concurrent users creating nodes with data size
> of
> > 4
> > > bytes.
> > > 2. It's pure write
> > > 3. The perf issue is that under the same load, there were many session
> > > expired and connection loss errors when using ZK 3.6.2 but no such
> errors
> > > in ZK 3.4.14.
> > >
> > > The following are some updates on the issue.
> > >
> > > 1. We've checked the fine grained metrics and found that the
> > > CommitProcessor was the bottleneck. The commit_commit_proc_req_queued
> and
> > > the write_commitproc_time_ms were large.
> > > The errors were caused by too many commit requests queued up in the
> > > CommitProcessor and waiting to be processed.
> > > 2. We've found that increasing the maxCommitBatchSize can reduce both
> the
> > > session expired and connection loss errors.
> > > 3. We didn't observe any significant perf impact from the
> > RequestThrottler.
> > >
> > >
> > > Please let me know if you or anyone has any questions.
> > >
> > > Thanks,
> > >
> > > Li
> > >
> > >
> > >
> > > On Tue, Apr 20, 2021 at 8:03 PM Michael Han <ha...@apache.org> wrote:
> > >
> > > > What is the workload looking like? Is it pure write, or mixed read
> > write?
> > > >
> > > > A couple of ideas to move this forward:
> > > > * Publish the performance benchmark so the community can help.
> > > > * Bisect git commit and find the bad commit that caused the
> regression.
> > > > * Use the fine grained metrics introduced in 3.6 (e.g per processor
> > stage
> > > > metrics) to measure where time spends during writes. We might have to
> > add
> > > > these metrics on 3.4 to get a fair comparison.
> > > >
> > > > For the throttling - the RequestThrottler introduced in 3.6 does
> > > introduce
> > > > latency, but should not impact throughput this much.
> > > >
> > > > On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
> > > >
> > > > > The CPU usage of both server and client are normal (< 50%) during
> the
> > > > test.
> > > > >
> > > > > Based on the investigation, the server is too busy with the load.
> > > > >
> > > > > The issue doesn't exist in 3.4.14. I wonder why there is a
> > significant
> > > > > write performance degradation from 3.4.14 to 3.6.2 and how we can
> > > address
> > > > > the issue.
> > > > >
> > > > > Best,
> > > > >
> > > > > Li
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org>
> > > wrote:
> > > > >
> > > > > > What is the CPU usage of both server and client during the test?
> > > > > >
> > > > > > Looks like server is dropping the clients because either the
> server
> > > or
> > > > > > both are too busy to deal with the load.
> > > > > > This log line is also concerning: "Too busy to snap, skipping”
> > > > > >
> > > > > > If that’s the case I believe you'll have to profile the server
> > > process
> > > > to
> > > > > > figure out where the perf bottleneck is.
> > > > > >
> > > > > > Andor
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > > > > >
> > > > > > > Thanks, Patrick.
> > > > > > >
> > > > > > > Yes, we are using the same JVM version and GC configurations
> when
> > > > > > > running the two tests. I have checked the GC metrics and also
> the
> > > > heap
> > > > > > dump
> > > > > > > of the 3.6, the GC pause and the memory usage look okay.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Li
> > > > > > >
> > > > > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <phunt@apache.org
> >
> > > > wrote:
> > > > > > >
> > > > > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com>
> > > wrote:
> > > > > > >>
> > > > > > >>> Hi Enrico, Sushant,
> > > > > > >>>
> > > > > > >>> I re-run the perf test with the data consistency check
> feature
> > > > > disabled
> > > > > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write
> performance
> > > > issue
> > > > > of
> > > > > > >> 3.6
> > > > > > >>> is still there.
> > > > > > >>>
> > > > > > >>> With everything exactly the same, the throughput of 3.6 was
> > only
> > > > 1/2
> > > > > of
> > > > > > >> 3.4
> > > > > > >>> and the max latency was more than 8 times.
> > > > > > >>>
> > > > > > >>> Any other points or thoughts?
> > > > > > >>>
> > > > > > >>>
> > > > > > >> In the past I've noticed a big impact of GC when doing certain
> > > > > > performance
> > > > > > >> measurements. I assume you are using the same JVM version and
> GC
> > > > when
> > > > > > >> running the two tests? Perhaps our memory footprint has
> expanded
> > > > over
> > > > > > time.
> > > > > > >> You should rule out GC by running with gc logging turned on
> with
> > > > both
> > > > > > >> versions and compare the impact.
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >>
> > > > > > >> Patrick
> > > > > > >>
> > > > > > >>
> > > > > > >>> Cheers,
> > > > > > >>>
> > > > > > >>> Li
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com>
> > > wrote:
> > > > > > >>>
> > > > > > >>>> Thanks Sushant and Enrico!
> > > > > > >>>>
> > > > > > >>>> This is a really good point.  According to the 3.6
> > > documentation,
> > > > > the
> > > > > > >>>> feature is disabled by default.
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > > > > >>> .
> > > > > > >>>> However, checking the code, the default is enabled.
> > > > > > >>>>
> > > > > > >>>> Let me set the zookeeper.digest.enabled to false and see how
> > the
> > > > > write
> > > > > > >>>> operation performs.
> > > > > > >>>>
> > > > > > >>>> Best,
> > > > > > >>>>
> > > > > > >>>> Li
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > > > > sushantmane7@gmail.com>
> > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > >>>>> Hi Li,
> > > > > > >>>>>
> > > > > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> > > > default:
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > > > > >>>>> .
> > > > > > >>>>> It is not present in ZK 3.4.14.
> > > > > > >>>>>
> > > > > > >>>>> This feature does have some impact on write performance.
> > > > > > >>>>>
> > > > > > >>>>> Thanks,
> > > > > > >>>>> Sushant
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > > > > eolivelli@gmail.com
> > > > > > >>>
> > > > > > >>>>> wrote:
> > > > > > >>>>>
> > > > > > >>>>>> Li,
> > > > > > >>>>>> I wonder of we have some new throttling/back pressure
> > > mechanisms
> > > > > > >> that
> > > > > > >>> is
> > > > > > >>>>>> enabled by default.
> > > > > > >>>>>>
> > > > > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>> Enrico
> > > > > > >>>>>>
> > > > > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> > > > scritto:
> > > > > > >>>>>>
> > > > > > >>>>>>> Hi,
> > > > > > >>>>>>>
> > > > > > >>>>>>> We switched to Netty on both client side and server side
> > and
> > > > the
> > > > > > >>>>>>> performance issue is still there.  Anyone has any
> insights
> > on
> > > > > what
> > > > > > >>>>> could
> > > > > > >>>>>> be
> > > > > > >>>>>>> the cause of higher latency?
> > > > > > >>>>>>>
> > > > > > >>>>>>> Thanks,
> > > > > > >>>>>>>
> > > > > > >>>>>>> Li
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <
> li4wang@gmail.com
> > >
> > > > > > >> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>> Hi Enrico,
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Thanks for the reply.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency:
> 57ms,
> > > > Max
> > > > > > >>>>> Latency
> > > > > > >>>>>>> 31s
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,
> Max
> > > > > > >>> Latency:
> > > > > > >>>>>> 1.6s
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 10G of Heap
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 13G of Memory
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 5 Participante
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 5 Observere
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Client session timeout: 3000ms
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Server min session time: 4000ms
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many
> > “Expiring
> > > > > > >>>>> session”
> > > > > > >>>>>>>> INFO log
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > > > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected
> > > exception
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> EndOfStreamException: Unable to read additional data
> from
> > > > > > >> client,
> > > > > > >>> it
> > > > > > >>>>>>>> probably closed the socket: address = /
> > 100.108.63.116:43366
> > > ,
> > > > > > >>>>> session =
> > > > > > >>>>>>>> 0x400189fee9a000b
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> at
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> at
> > > > > > >>>>>>
> > > > > > >>
> > > >
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> at
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> at
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> at
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> at
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > > > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to
> > snap,
> > > > > > >>>>> skipping
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > > > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > > > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > > > > >> Actually,
> > > > > > >>>>> the
> > > > > > >>>>>>>> issue happened with the combinations of
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 3.4 client and 3.6 server
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 3.6 client and 3.6 server
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Please let me know if you need any additional info.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Thanks,
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Li
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <
> > li4wang@gmail.com>
> > > > > > >>> wrote:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> Hi Enrico,
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Thanks for the reply.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based
> > > yet.
> > > > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> 3.6:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > > > > >>>>> eolivelli@gmail.com
> > > > > > >>>>>>>
> > > > > > >>>>>>>>> wrote:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>> IIRC The main difference is about the switch to Netty
> 4
> > > and
> > > > > > >>> about
> > > > > > >>>>>> using
> > > > > > >>>>>>>>>> more DirectMemory. Are you using the Netty based
> stack?
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Apart from that macro difference there have been many
> > many
> > > > > > >>> changes
> > > > > > >>>>>>> since
> > > > > > >>>>>>>>>> 3.4.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Do you have some metrics to share?
> > > > > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration
> > > equals
> > > > > > >> to
> > > > > > >>>>> each
> > > > > > >>>>>>>>>> other?
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Did you upgrade both the client and the server or only
> > the
> > > > > > >>> server?
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Enrico
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com>
> > ha
> > > > > > >>> scritto:
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>> Hi,
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > > > > >>>>> perform/load
> > > > > > >>>>>>>>>>> comparison test,  it was found that the performance
> of
> > > 3.6
> > > > > > >> has
> > > > > > >>>>> been
> > > > > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > > > > >>> operation.
> > > > > > >>>>>> Under
> > > > > > >>>>>>>>>> the
> > > > > > >>>>>>>>>>> same load, there was a huge number of SessionExpired
> > and
> > > > > > >>>>>>> ConnectionLoss
> > > > > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> The load testing is 500 concurrent users with a
> cluster
> > > of
> > > > 5
> > > > > > >>>>>>>>>> participants
> > > > > > >>>>>>>>>>> and 5 observers. The min session timeout on the
> server
> > > side
> > > > > > >> is
> > > > > > >>>>>>> 4000ms.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has
> any
> > > > > > >>> insights
> > > > > > >>>>> on
> > > > > > >>>>>>> what
> > > > > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Thanks
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Li
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by Li Wang <li...@gmail.com>.
Hi Michael,

Thanks for your additional inputs.

On Mon, May 3, 2021 at 3:13 PM Michael Han <ha...@apache.org> wrote:

> Hi Li,
>
> Thanks for following up.
>
> >> write_commitproc_time_ms were large
>
> This measures how long a local write op hears back from the leader. If it's
> big, then either the leader is very busy acking the request, or your
> network RTT is high.
>

Does it measure how long a local write op takes in the commit processor
phase?

ServerMetrics.getMetrics().WRITE_COMMITPROC_TIME.add(currentTime -
request.commitProcQueueStartTime);
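
(My reading, from the field name: commitProcQueueStartTime is stamped when
the request is handed to the CommitProcessor, so this delta is the time the
write spends in the commit processor, queueing included. Please correct me
if that is wrong.)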


> How does the local fsync time (fsynctime) look like between two tests?
>

The fsync time looks similar between two tests.

>
> >> We've found that increasing the maxCommitBatchSize
>
> Are you able to successfully tune this value so your benchmark against 3.6
> is on par with 3.4 now? The original report mentioned lots of session
> timeout and con loss. I am wondering if we can fix this first by tuning
> this size.
>

We ran load tests with maxCommitBatchSize set to 500 vs 1; 500 was used
because we have 500 concurrent users in the load test.
The connection loss error count was reduced by about 40% and the session
expired error count was reduced by about 45%.
Tuning maxCommitBatchSize can significantly reduce the errors. We cannot
say the benchmark is on par with 3.4 (i.e. no errors),
because the tests were run with Prometheus enabled, which is new in 3.6
and has a significant negative perf impact.
We will run tests with Prometheus disabled and maxCommitBatchSize set to 500
when we get a chance.
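
For reference, a minimal sketch of how the knob can be set, assuming the
property name the 3.6 CommitProcessor reads is
zookeeper.commitProcessor.maxCommitBatchSize (an assumption worth
double-checking against your version):

    // Passed as a server JVM flag:
    //   -Dzookeeper.commitProcessor.maxCommitBatchSize=500
    // or, equivalently, set before the server starts:
    System.setProperty("zookeeper.commitProcessor.maxCommitBatchSize", "500");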

>
> The major difference of CommitProcessor between 3.4.14 and 3.6.2 is the
> newly added per session commit queue such that reads and a write from
> different sessions can be processed concurrently.

Yes, I noticed that there is a queue per session in 3.6, but I don't think
the writes can be processed concurrently. The CommitProcessor is
single-threaded and the CommitProc worker threads are only used for reads.
Did I miss anything?

> This works best for mixed read/write workloads; for pure write workloads
> the new CommitProcessor is not superior, as all writes still have to be
> serialized due to global ordering, and the per-session queues add their own
> overhead: in this test, for example, ZK has to manage 500 queues (and the
> enqueue/dequeue bookkeeping costs cycles). Though I didn't expect this
> overhead to create such a big difference in your test.
>

Is it correct to say that the new CommitProcessor works best for the reads
in a mixed read/write workload, since only the reads can be processed
concurrently?


> Also, this is obvious but I just want to confirm: was the benchmark for the
> two versions of ZK run on the exact same test environment, including the
> same OS and networking configuration?
>

Yes, the benchmark for the two versions was done on the same test
environment and configuration.

>
> On Mon, Apr 26, 2021 at 7:35 PM Li Wang <li...@gmail.com> wrote:
>
> > Hi Michael,
> >
> > Thanks for your reply.
> >
> > 1. The workload is 500 concurrent users creating nodes with data size of
> 4
> > bytes.
> > 2. It's pure write
> > 3. The perf issue is that under the same load, there were many session
> > expired and connection loss errors when using ZK 3.6.2 but no such errors
> > in ZK 3.4.14.
> >
> > The following are some updates on the issue.
> >
> > 1. We've checked the fine grained metrics and found that the
> > CommitProcessor was the bottleneck. The commit_commit_proc_req_queued and
> > the write_commitproc_time_ms were large.
> > The errors were caused by too many commit requests queued up in the
> > CommitProcessor and waiting to be processed.
> > 2. We've found that increasing the maxCommitBatchSize can reduce both the
> > session expired and connection loss errors.
> > 3. We didn't observe any significant perf impact from the
> RequestThrottler.
> >
> >
> > Please let me know if you or anyone has any questions.
> >
> > Thanks,
> >
> > Li
> >
> >
> >
> > On Tue, Apr 20, 2021 at 8:03 PM Michael Han <ha...@apache.org> wrote:
> >
> > > What is the workload looking like? Is it pure write, or mixed read
> write?
> > >
> > > A couple of ideas to move this forward:
> > > * Publish the performance benchmark so the community can help.
> > > * Bisect git commit and find the bad commit that caused the regression.
> > > * Use the fine grained metrics introduced in 3.6 (e.g per processor
> stage
> > > metrics) to measure where time spends during writes. We might have to
> add
> > > these metrics on 3.4 to get a fair comparison.
> > >
> > > For the throttling - the RequestThrottler introduced in 3.6 does
> > introduce
> > > latency, but should not impact throughput this much.
> > >
> > > On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
> > >
> > > > The CPU usage of both server and client are normal (< 50%) during the
> > > test.
> > > >
> > > > Based on the investigation, the server is too busy with the load.
> > > >
> > > > The issue doesn't exist in 3.4.14. I wonder why there is a
> significant
> > > > write performance degradation from 3.4.14 to 3.6.2 and how we can
> > address
> > > > the issue.
> > > >
> > > > Best,
> > > >
> > > > Li
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org>
> > wrote:
> > > >
> > > > > What is the CPU usage of both server and client during the test?
> > > > >
> > > > > Looks like server is dropping the clients because either the server
> > or
> > > > > both are too busy to deal with the load.
> > > > > This log line is also concerning: "Too busy to snap, skipping”
> > > > >
> > > > > If that’s the case I believe you'll have to profile the server
> > process
> > > to
> > > > > figure out where the perf bottleneck is.
> > > > >
> > > > > Andor
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > > > >
> > > > > > Thanks, Patrick.
> > > > > >
> > > > > > Yes, we are using the same JVM version and GC configurations when
> > > > > > running the two tests. I have checked the GC metrics and also the
> > > heap
> > > > > dump
> > > > > > of the 3.6, the GC pause and the memory usage look okay.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Li
> > > > > >
> > > > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> > > wrote:
> > > > > >
> > > > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com>
> > wrote:
> > > > > >>
> > > > > >>> Hi Enrico, Sushant,
> > > > > >>>
> > > > > >>> I re-run the perf test with the data consistency check feature
> > > > disabled
> > > > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> > > issue
> > > > of
> > > > > >> 3.6
> > > > > >>> is still there.
> > > > > >>>
> > > > > >>> With everything exactly the same, the throughput of 3.6 was
> only
> > > 1/2
> > > > of
> > > > > >> 3.4
> > > > > >>> and the max latency was more than 8 times.
> > > > > >>>
> > > > > >>> Any other points or thoughts?
> > > > > >>>
> > > > > >>>
> > > > > >> In the past I've noticed a big impact of GC when doing certain
> > > > > performance
> > > > > >> measurements. I assume you are using the same JVM version and GC
> > > when
> > > > > >> running the two tests? Perhaps our memory footprint has expanded
> > > over
> > > > > time.
> > > > > >> You should rule out GC by running with gc logging turned on with
> > > both
> > > > > >> versions and compare the impact.
> > > > > >>
> > > > > >> Regards,
> > > > > >>
> > > > > >> Patrick
> > > > > >>
> > > > > >>
> > > > > >>> Cheers,
> > > > > >>>
> > > > > >>> Li
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com>
> > wrote:
> > > > > >>>
> > > > > >>>> Thanks Sushant and Enrico!
> > > > > >>>>
> > > > > >>>> This is a really good point.  According to the 3.6
> > documentation,
> > > > the
> > > > > >>>> feature is disabled by default.
> > > > > >>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > > > >>> .
> > > > > >>>> However, checking the code, the default is enabled.
> > > > > >>>>
> > > > > >>>> Let me set the zookeeper.digest.enabled to false and see how
> the
> > > > write
> > > > > >>>> operation performs.
> > > > > >>>>
> > > > > >>>> Best,
> > > > > >>>>
> > > > > >>>> Li
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > > > sushantmane7@gmail.com>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>> Hi Li,
> > > > > >>>>>
> > > > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> > > default:
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > > > >>>>> .
> > > > > >>>>> It is not present in ZK 3.4.14.
> > > > > >>>>>
> > > > > >>>>> This feature does have some impact on write performance.
> > > > > >>>>>
> > > > > >>>>> Thanks,
> > > > > >>>>> Sushant
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > > > eolivelli@gmail.com
> > > > > >>>
> > > > > >>>>> wrote:
> > > > > >>>>>
> > > > > >>>>>> Li,
> > > > > >>>>>> I wonder of we have some new throttling/back pressure
> > mechanisms
> > > > > >> that
> > > > > >>> is
> > > > > >>>>>> enabled by default.
> > > > > >>>>>>
> > > > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>> Enrico
> > > > > >>>>>>
> > > > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> > > scritto:
> > > > > >>>>>>
> > > > > >>>>>>> Hi,
> > > > > >>>>>>>
> > > > > >>>>>>> We switched to Netty on both client side and server side
> and
> > > the
> > > > > >>>>>>> performance issue is still there.  Anyone has any insights
> on
> > > > what
> > > > > >>>>> could
> > > > > >>>>>> be
> > > > > >>>>>>> the cause of higher latency?
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks,
> > > > > >>>>>>>
> > > > > >>>>>>> Li
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li4wang@gmail.com
> >
> > > > > >> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Hi Enrico,
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> Thanks for the reply.
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > > > >>>>>>>>
> > > > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> > > Max
> > > > > >>>>> Latency
> > > > > >>>>>>> 31s
> > > > > >>>>>>>>
> > > > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > > > > >>> Latency:
> > > > > >>>>>> 1.6s
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > > > >>>>>>>>
> > > > > >>>>>>>> 10G of Heap
> > > > > >>>>>>>>
> > > > > >>>>>>>> 13G of Memory
> > > > > >>>>>>>>
> > > > > >>>>>>>> 5 Participante
> > > > > >>>>>>>>
> > > > > >>>>>>>> 5 Observere
> > > > > >>>>>>>>
> > > > > >>>>>>>> Client session timeout: 3000ms
> > > > > >>>>>>>>
> > > > > >>>>>>>> Server min session time: 4000ms
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many
> “Expiring
> > > > > >>>>> session”
> > > > > >>>>>>>> INFO log
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected
> > exception
> > > > > >>>>>>>>
> > > > > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > > > > >> client,
> > > > > >>> it
> > > > > >>>>>>>> probably closed the socket: address = /
> 100.108.63.116:43366
> > ,
> > > > > >>>>> session =
> > > > > >>>>>>>> 0x400189fee9a000b
> > > > > >>>>>>>>
> > > > > >>>>>>>> at
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > > > >>>>>>>>
> > > > > >>>>>>>> at
> > > > > >>>>>>
> > > > > >>
> > > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > > > >>>>>>>>
> > > > > >>>>>>>> at
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > > > >>>>>>>>
> > > > > >>>>>>>> at
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > > > >>>>>>>>
> > > > > >>>>>>>> at
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > > > >>>>>>>>
> > > > > >>>>>>>> at
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > > > >>>>>>>>
> > > > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to
> snap,
> > > > > >>>>> skipping
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > > > >> Actually,
> > > > > >>>>> the
> > > > > >>>>>>>> issue happened with the combinations of
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> 3.4 client and 3.6 server
> > > > > >>>>>>>>
> > > > > >>>>>>>> 3.6 client and 3.6 server
> > > > > >>>>>>>>
> > > > > >>>>>>>> Please let me know if you need any additional info.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Thanks,
> > > > > >>>>>>>>
> > > > > >>>>>>>> Li
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <
> li4wang@gmail.com>
> > > > > >>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> Hi Enrico,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Thanks for the reply.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based
> > yet.
> > > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> 3.6:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > > > >>>>> eolivelli@gmail.com
> > > > > >>>>>>>
> > > > > >>>>>>>>> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4
> > and
> > > > > >>> about
> > > > > >>>>>> using
> > > > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Apart from that macro difference there have been many
> many
> > > > > >>> changes
> > > > > >>>>>>> since
> > > > > >>>>>>>>>> 3.4.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Do you have some metrics to share?
> > > > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration
> > equals
> > > > > >> to
> > > > > >>>>> each
> > > > > >>>>>>>>>> other?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Did you upgrade both the client and the server or only
> the
> > > > > >>> server?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Enrico
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com>
> ha
> > > > > >>> scritto:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> Hi,
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > > > >>>>> perform/load
> > > > > >>>>>>>>>>> comparison test,  it was found that the performance of
> > 3.6
> > > > > >> has
> > > > > >>>>> been
> > > > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > > > >>> operation.
> > > > > >>>>>> Under
> > > > > >>>>>>>>>> the
> > > > > >>>>>>>>>>> same load, there was a huge number of SessionExpired
> and
> > > > > >>>>>>> ConnectionLoss
> > > > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster
> > of
> > > 5
> > > > > >>>>>>>>>> participants
> > > > > >>>>>>>>>>> and 5 observers. The min session timeout on the server
> > side
> > > > > >> is
> > > > > >>>>>>> 4000ms.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > > > >>> insights
> > > > > >>>>> on
> > > > > >>>>>>> what
> > > > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Thanks
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Li
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by Michael Han <ha...@apache.org>.
Hi Li,

Thanks for following up.

>> write_commitproc_time_ms were large

This measures how long it takes a local write op to hear back from the
leader. If it's big, then either the leader is very busy acking the request,
or your network RTT is high.

What does the local fsync time (fsynctime) look like between the two tests?

>> We've found that increasing the maxCommitBatchSize

Are you able to successfully tune this value so your benchmark against 3.6
is on par with 3.4 now? The original report mentioned lots of session
timeouts and connection losses. I am wondering if we can fix those first by
tuning this size.

The major difference in the CommitProcessor between 3.4.14 and 3.6.2 is the
newly added per-session commit queue, which lets reads and a write from
different sessions be processed concurrently. This works best for mixed
read/write workloads; for pure write workloads the new CommitProcessor is not
superior, as all writes still have to be serialized due to global ordering,
and the per-session queues add their own overhead: in this test, for example,
ZK has to manage 500 queues (and the enqueue/dequeue bookkeeping costs
cycles). Though I didn't expect this overhead to create such a big difference
in your test.
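
To illustrate the shape of the pipeline, here is a deliberately simplified
sketch (not the actual CommitProcessor code; the queue and thread pool below
are stand-ins): committed writes are drained in order by a single thread,
while reads can be fanned out to worker threads, which is why the extra
concurrency only pays off when there are reads to overlap with the writes.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    public class CommitPipelineSketch {
        public static void main(String[] args) throws InterruptedException {
            LinkedBlockingQueue<String> committedWrites = new LinkedBlockingQueue<>();
            ExecutorService readWorkers = Executors.newFixedThreadPool(4);

            for (int i = 0; i < 5; i++) {
                committedWrites.put("write-" + i);          // writes queue up
                final int id = i;
                readWorkers.execute(() ->                   // reads overlap freely
                    System.out.println("read-" + id + " served concurrently"));
            }

            // The serialization point: one thread applies writes in order, no
            // matter how many per-session queues fed them.
            String w;
            while ((w = committedWrites.poll()) != null) {
                System.out.println("applied " + w + " in order");
            }
            readWorkers.shutdown();
        }
    }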

Also, this is obvious but I just want to confirm: was the benchmark for the
two versions of ZK run on the exact same test environment, including the same
OS and networking configuration?

On Mon, Apr 26, 2021 at 7:35 PM Li Wang <li...@gmail.com> wrote:

> Hi Michael,
>
> Thanks for your reply.
>
> 1. The workload is 500 concurrent users creating nodes with data size of 4
> bytes.
> 2. It's pure write
> 3. The perf issue is that under the same load, there were many session
> expired and connection loss errors when using ZK 3.6.2 but no such errors
> in ZK 3.4.14.
>
> The following are some updates on the issue.
>
> 1. We've checked the fine grained metrics and found that the
> CommitProcessor was the bottleneck. The commit_commit_proc_req_queued and
> the write_commitproc_time_ms were large.
> The errors were caused by too many commit requests queued up in the
> CommitProcessor and waiting to be processed.
> 2. We've found that increasing the maxCommitBatchSize can reduce both the
> session expired and connection loss errors.
> 3. We didn't observe any significant perf impact from the RequestThrottler.
>
>
> Please let me know if you or anyone has any questions.
>
> Thanks,
>
> Li
>
>
>
> On Tue, Apr 20, 2021 at 8:03 PM Michael Han <ha...@apache.org> wrote:
>
> > What is the workload looking like? Is it pure write, or mixed read write?
> >
> > A couple of ideas to move this forward:
> > * Publish the performance benchmark so the community can help.
> > * Bisect git commit and find the bad commit that caused the regression.
> > * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> > metrics) to measure where time spends during writes. We might have to add
> > these metrics on 3.4 to get a fair comparison.
> >
> > For the throttling - the RequestThrottler introduced in 3.6 does
> introduce
> > latency, but should not impact throughput this much.
> >
> > On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
> >
> > > The CPU usage of both server and client are normal (< 50%) during the
> > test.
> > >
> > > Based on the investigation, the server is too busy with the load.
> > >
> > > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > > write performance degradation from 3.4.14 to 3.6.2 and how we can
> address
> > > the issue.
> > >
> > > Best,
> > >
> > > Li
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org>
> wrote:
> > >
> > > > What is the CPU usage of both server and client during the test?
> > > >
> > > > Looks like server is dropping the clients because either the server
> or
> > > > both are too busy to deal with the load.
> > > > This log line is also concerning: "Too busy to snap, skipping”
> > > >
> > > > If that’s the case I believe you'll have to profile the server
> process
> > to
> > > > figure out where the perf bottleneck is.
> > > >
> > > > Andor
> > > >
> > > >
> > > >
> > > >
> > > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > > >
> > > > > Thanks, Patrick.
> > > > >
> > > > > Yes, we are using the same JVM version and GC configurations when
> > > > > running the two tests. I have checked the GC metrics and also the
> > heap
> > > > dump
> > > > > of the 3.6, the GC pause and the memory usage look okay.
> > > > >
> > > > > Best,
> > > > >
> > > > > Li
> > > > >
> > > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> > wrote:
> > > > >
> > > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com>
> wrote:
> > > > >>
> > > > >>> Hi Enrico, Sushant,
> > > > >>>
> > > > >>> I re-run the perf test with the data consistency check feature
> > > disabled
> > > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> > issue
> > > of
> > > > >> 3.6
> > > > >>> is still there.
> > > > >>>
> > > > >>> With everything exactly the same, the throughput of 3.6 was only
> > 1/2
> > > of
> > > > >> 3.4
> > > > >>> and the max latency was more than 8 times.
> > > > >>>
> > > > >>> Any other points or thoughts?
> > > > >>>
> > > > >>>
> > > > >> In the past I've noticed a big impact of GC when doing certain
> > > > performance
> > > > >> measurements. I assume you are using the same JVM version and GC
> > when
> > > > >> running the two tests? Perhaps our memory footprint has expanded
> > over
> > > > time.
> > > > >> You should rule out GC by running with gc logging turned on with
> > both
> > > > >> versions and compare the impact.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Patrick
> > > > >>
> > > > >>
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Li
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com>
> wrote:
> > > > >>>
> > > > >>>> Thanks Sushant and Enrico!
> > > > >>>>
> > > > >>>> This is a really good point.  According to the 3.6
> documentation,
> > > the
> > > > >>>> feature is disabled by default.
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > > >>> .
> > > > >>>> However, checking the code, the default is enabled.
> > > > >>>>
> > > > >>>> Let me set the zookeeper.digest.enabled to false and see how the
> > > write
> > > > >>>> operation performs.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>>
> > > > >>>> Li
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > > sushantmane7@gmail.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Hi Li,
> > > > >>>>>
> > > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> > default:
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > > >>>>> .
> > > > >>>>> It is not present in ZK 3.4.14.
> > > > >>>>>
> > > > >>>>> This feature does have some impact on write performance.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Sushant
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > > eolivelli@gmail.com
> > > > >>>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Li,
> > > > >>>>>> I wonder of we have some new throttling/back pressure
> mechanisms
> > > > >> that
> > > > >>> is
> > > > >>>>>> enabled by default.
> > > > >>>>>>
> > > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Enrico
> > > > >>>>>>
> > > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> > scritto:
> > > > >>>>>>
> > > > >>>>>>> Hi,
> > > > >>>>>>>
> > > > >>>>>>> We switched to Netty on both client side and server side and
> > the
> > > > >>>>>>> performance issue is still there.  Anyone has any insights on
> > > what
> > > > >>>>> could
> > > > >>>>>> be
> > > > >>>>>>> the cause of higher latency?
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>>
> > > > >>>>>>> Li
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> > > > >> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Enrico,
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks for the reply.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> > Max
> > > > >>>>> Latency
> > > > >>>>>>> 31s
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > > > >>> Latency:
> > > > >>>>>> 1.6s
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > > >>>>>>>>
> > > > >>>>>>>> 10G of Heap
> > > > >>>>>>>>
> > > > >>>>>>>> 13G of Memory
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Participante
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Observere
> > > > >>>>>>>>
> > > > >>>>>>>> Client session timeout: 3000ms
> > > > >>>>>>>>
> > > > >>>>>>>> Server min session time: 4000ms
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> > > > >>>>> session”
> > > > >>>>>>>> INFO log
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected
> exception
> > > > >>>>>>>>
> > > > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > > > >> client,
> > > > >>> it
> > > > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366
> ,
> > > > >>>>> session =
> > > > >>>>>>>> 0x400189fee9a000b
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>
> > > > >>
> > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > > >>>>>>>>
> > > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > > > >>>>> skipping
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > > >> Actually,
> > > > >>>>> the
> > > > >>>>>>>> issue happened with the combinations of
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> Please let me know if you need any additional info.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>>
> > > > >>>>>>>> Li
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > > > >>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi Enrico,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for the reply.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based
> yet.
> > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > >>>>>>>>>
> > > > >>>>>>>>> 3.6:
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > > >>>>> eolivelli@gmail.com
> > > > >>>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4
> and
> > > > >>> about
> > > > >>>>>> using
> > > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Apart from that macro difference there have been many many
> > > > >>> changes
> > > > >>>>>>> since
> > > > >>>>>>>>>> 3.4.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you have some metrics to share?
> > > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration
> equals
> > > > >> to
> > > > >>>>> each
> > > > >>>>>>>>>> other?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > > > >>> server?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Enrico
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > > > >>> scritto:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > > >>>>> perform/load
> > > > >>>>>>>>>>> comparison test,  it was found that the performance of
> 3.6
> > > > >> has
> > > > >>>>> been
> > > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > > >>> operation.
> > > > >>>>>> Under
> > > > >>>>>>>>>> the
> > > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > > > >>>>>>> ConnectionLoss
> > > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster
> of
> > 5
> > > > >>>>>>>>>> participants
> > > > >>>>>>>>>>> and 5 observers. The min session timeout on the server
> side
> > > > >> is
> > > > >>>>>>> 4000ms.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > > >>> insights
> > > > >>>>> on
> > > > >>>>>>> what
> > > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Li
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>

> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > > >>>>>>>>
> > > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > > > >>>>> skipping
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > > >> Actually,
> > > > >>>>> the
> > > > >>>>>>>> issue happened with the combinations of
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> Please let me know if you need any additional info.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>>
> > > > >>>>>>>> Li
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > > > >>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi Enrico,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for the reply.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based
> yet.
> > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > >>>>>>>>>
> > > > >>>>>>>>> 3.6:
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > > >>>>> eolivelli@gmail.com
> > > > >>>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4
> and
> > > > >>> about
> > > > >>>>>> using
> > > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Apart from that macro difference there have been many many
> > > > >>> changes
> > > > >>>>>>> since
> > > > >>>>>>>>>> 3.4.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you have some metrics to share?
> > > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration
> equals
> > > > >> to
> > > > >>>>> each
> > > > >>>>>>>>>> other?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > > > >>> server?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Enrico
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > > > >>> scritto:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > > >>>>> perform/load
> > > > >>>>>>>>>>> comparison test,  it was found that the performance of
> 3.6
> > > > >> has
> > > > >>>>> been
> > > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > > >>> operation.
> > > > >>>>>> Under
> > > > >>>>>>>>>> the
> > > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > > > >>>>>>> ConnectionLoss
> > > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster
> of
> > 5
> > > > >>>>>>>>>> participants
> > > > >>>>>>>>>>> and 5 observers. The min session timeout on the server
> side
> > > > >> is
> > > > >>>>>>> 4000ms.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > > >>> insights
> > > > >>>>> on
> > > > >>>>>>> what
> > > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Li
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by Li Wang <li...@gmail.com>.
Hi Michael,

Thanks for your reply.

1. The workload is 500 concurrent users creating nodes with a data size of 4
bytes (a minimal sketch of this workload follows this list).
2. It's pure write.
3. The perf issue is that, under the same load, there were many SessionExpired
and ConnectionLoss errors when using ZK 3.6.2 but no such errors with ZK 3.4.14.
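
For reference, the workload generator is roughly equivalent to the sketch
below. This is only an illustration of the setup described above, not the
actual benchmark code: the connect string, the znode path prefix, the
per-client operation count and the use of persistent znodes are assumptions.

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.atomic.AtomicLong;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class PureWriteLoad {
        public static void main(String[] args) throws Exception {
            final int clients = 500;            // 500 concurrent users
            final byte[] payload = new byte[4]; // 4-byte data per znode
            final AtomicLong failures = new AtomicLong();
            final CountDownLatch done = new CountDownLatch(clients);
            for (int i = 0; i < clients; i++) {
                final int id = i;
                new Thread(() -> {
                    try {
                        // 3000 ms client session timeout, as in the reported setup
                        ZooKeeper zk = new ZooKeeper("zk1:2181", 3000, event -> { });
                        for (int n = 0; n < 10_000; n++) {
                            try {
                                zk.create("/load-" + id + "-" + n, payload,
                                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                            } catch (KeeperException e) {
                                // ConnectionLoss, SessionExpired, ... count as failures
                                failures.incrementAndGet();
                            }
                        }
                        zk.close();
                    } catch (Exception e) {
                        failures.incrementAndGet();
                    } finally {
                        done.countDown();
                    }
                }).start();
            }
            done.await();
            System.out.println("failed ops: " + failures.get());
        }
    }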

The following are some updates on the issue.

1. We've checked the fine-grained metrics and found that the CommitProcessor
was the bottleneck: both commit_commit_proc_req_queued and
write_commitproc_time_ms were large. The errors were caused by too many
commit requests queuing up in the CommitProcessor while waiting to be
processed.
2. We've found that increasing maxCommitBatchSize reduces both the
SessionExpired and ConnectionLoss errors (see the sketch after this list).
3. We didn't observe any significant perf impact from the RequestThrottler.
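
In case it helps others reproduce the tuning, the sketch below shows one way
the relevant knobs can be set before the server starts. The property names
(zookeeper.maxCommitBatchSize, zookeeper.digest.enabled,
zookeeper.request_throttle_max_requests) and the value 500 are my reading of
the 3.6.2 code and of this thread, so please verify them against the source
before relying on them; in practice the same settings are usually passed as
-D flags through JVMFLAGS / conf/zookeeper-env.sh rather than a wrapper class.

    import org.apache.zookeeper.server.quorum.QuorumPeerMain;

    // Minimal wrapper around the normal server entry point that raises the
    // commit batch size and switches off the optional 3.6 features discussed
    // in this thread. Values are examples only.
    public class TunedZkServer {
        public static void main(String[] args) {
            // Let the CommitProcessor commit more requests per pass.
            System.setProperty("zookeeper.maxCommitBatchSize", "500");
            // Disable the digest-based consistency check and its write overhead.
            System.setProperty("zookeeper.digest.enabled", "false");
            // RequestThrottler limit; 0 is meant to leave throttling off.
            System.setProperty("zookeeper.request_throttle_max_requests", "0");
            // args[0] is expected to be the path to zoo.cfg.
            QuorumPeerMain.main(args);
        }
    }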


Please let me know if you or anyone has any questions.

Thanks,

Li



On Tue, Apr 20, 2021 at 8:03 PM Michael Han <ha...@apache.org> wrote:

> What is the workload looking like? Is it pure write, or mixed read write?
>
> A couple of ideas to move this forward:
> * Publish the performance benchmark so the community can help.
> * Bisect git commit and find the bad commit that caused the regression.
> * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> metrics) to measure where time spends during writes. We might have to add
> these metrics on 3.4 to get a fair comparison.
>
> For the throttling - the RequestThrottler introduced in 3.6 does introduce
> latency, but should not impact throughput this much.
>
> On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@gmail.com> wrote:
>
> > The CPU usage of both server and client are normal (< 50%) during the
> test.
> >
> > Based on the investigation, the server is too busy with the load.
> >
> > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > write performance degradation from 3.4.14 to 3.6.2 and how we can address
> > the issue.
> >
> > Best,
> >
> > Li
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@apache.org> wrote:
> >
> > > What is the CPU usage of both server and client during the test?
> > >
> > > Looks like server is dropping the clients because either the server or
> > > both are too busy to deal with the load.
> > > This log line is also concerning: "Too busy to snap, skipping”
> > >
> > > If that’s the case I believe you'll have to profile the server process
> to
> > > figure out where the perf bottleneck is.
> > >
> > > Andor
> > >
> > >
> > >
> > >
> > > > On 2021. Feb 22., at 5:31, Li Wang <li...@gmail.com> wrote:
> > > >
> > > > Thanks, Patrick.
> > > >
> > > > Yes, we are using the same JVM version and GC configurations when
> > > > running the two tests. I have checked the GC metrics and also the
> heap
> > > dump
> > > > of the 3.6, the GC pause and the memory usage look okay.
> > > >
> > > > Best,
> > > >
> > > > Li
> > > >
> > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@apache.org>
> wrote:
> > > >
> > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@gmail.com> wrote:
> > > >>
> > > >>> Hi Enrico, Sushant,
> > > >>>
> > > >>> I re-run the perf test with the data consistency check feature
> > disabled
> > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance
> issue
> > of
> > > >> 3.6
> > > >>> is still there.
> > > >>>
> > > >>> With everything exactly the same, the throughput of 3.6 was only
> 1/2
> > of
> > > >> 3.4
> > > >>> and the max latency was more than 8 times.
> > > >>>
> > > >>> Any other points or thoughts?
> > > >>>
> > > >>>
> > > >> In the past I've noticed a big impact of GC when doing certain
> > > performance
> > > >> measurements. I assume you are using the same JVM version and GC
> when
> > > >> running the two tests? Perhaps our memory footprint has expanded
> over
> > > time.
> > > >> You should rule out GC by running with gc logging turned on with
> both
> > > >> versions and compare the impact.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Patrick
> > > >>
> > > >>
> > > >>> Cheers,
> > > >>>
> > > >>> Li
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@gmail.com> wrote:
> > > >>>
> > > >>>> Thanks Sushant and Enrico!
> > > >>>>
> > > >>>> This is a really good point.  According to the 3.6 documentation,
> > the
> > > >>>> feature is disabled by default.
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration
> > > >>> .
> > > >>>> However, checking the code, the default is enabled.
> > > >>>>
> > > >>>> Let me set the zookeeper.digest.enabled to false and see how the
> > write
> > > >>>> operation performs.
> > > >>>>
> > > >>>> Best,
> > > >>>>
> > > >>>> Li
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <
> > sushantmane7@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi Li,
> > > >>>>>
> > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by
> default:
> > > >>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136
> > > >>>>> .
> > > >>>>> It is not present in ZK 3.4.14.
> > > >>>>>
> > > >>>>> This feature does have some impact on write performance.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Sushant
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <
> > > eolivelli@gmail.com
> > > >>>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Li,
> > > >>>>>> I wonder of we have some new throttling/back pressure mechanisms
> > > >> that
> > > >>> is
> > > >>>>>> enabled by default.
> > > >>>>>>
> > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Enrico
> > > >>>>>>
> > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@gmail.com> ha
> scritto:
> > > >>>>>>
> > > >>>>>>> Hi,
> > > >>>>>>>
> > > >>>>>>> We switched to Netty on both client side and server side and
> the
> > > >>>>>>> performance issue is still there.  Anyone has any insights on
> > what
> > > >>>>> could
> > > >>>>>> be
> > > >>>>>>> the cause of higher latency?
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>>
> > > >>>>>>> Li
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@gmail.com>
> > > >> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Enrico,
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> Thanks for the reply.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > >>>>>>>>
> > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,
> Max
> > > >>>>> Latency
> > > >>>>>>> 31s
> > > >>>>>>>>
> > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max
> > > >>> Latency:
> > > >>>>>> 1.6s
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > >>>>>>>>
> > > >>>>>>>> 10G of Heap
> > > >>>>>>>>
> > > >>>>>>>> 13G of Memory
> > > >>>>>>>>
> > > >>>>>>>> 5 Participante
> > > >>>>>>>>
> > > >>>>>>>> 5 Observere
> > > >>>>>>>>
> > > >>>>>>>> Client session timeout: 3000ms
> > > >>>>>>>>
> > > >>>>>>>> Server min session time: 4000ms
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring
> > > >>>>> session”
> > > >>>>>>>> INFO log
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
> > > >>>>>>>>
> > > >>>>>>>> EndOfStreamException: Unable to read additional data from
> > > >> client,
> > > >>> it
> > > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366,
> > > >>>>> session =
> > > >>>>>>>> 0x400189fee9a000b
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>
> > > >>
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > >>>>>>>>
> > > >>>>>>>> at
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > >>>>>>>>
> > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,
> > > >>>>> skipping
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.
> > > >> Actually,
> > > >>>>> the
> > > >>>>>>>> issue happened with the combinations of
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> 3.4 client and 3.6 server
> > > >>>>>>>>
> > > >>>>>>>> 3.6 client and 3.6 server
> > > >>>>>>>>
> > > >>>>>>>> Please let me know if you need any additional info.
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>>
> > > >>>>>>>> Li
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@gmail.com>
> > > >>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi Enrico,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for the reply.
> > > >>>>>>>>>
> > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based yet.
> > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > >>>>>>>>>
> > > >>>>>>>>> 3.6:
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <
> > > >>>>> eolivelli@gmail.com
> > > >>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4 and
> > > >>> about
> > > >>>>>> using
> > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Apart from that macro difference there have been many many
> > > >>> changes
> > > >>>>>>> since
> > > >>>>>>>>>> 3.4.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Do you have some metrics to share?
> > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration equals
> > > >> to
> > > >>>>> each
> > > >>>>>>>>>> other?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Do you see warnings on the server logs?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Did you upgrade both the client and the server or only the
> > > >>> server?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Enrico
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@gmail.com> ha
> > > >>> scritto:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the
> > > >>>>> perform/load
> > > >>>>>>>>>>> comparison test,  it was found that the performance of 3.6
> > > >> has
> > > >>>>> been
> > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write
> > > >>> operation.
> > > >>>>>> Under
> > > >>>>>>>>>> the
> > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and
> > > >>>>>>> ConnectionLoss
> > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster of
> 5
> > > >>>>>>>>>> participants
> > > >>>>>>>>>>> and 5 observers. The min session timeout on the server side
> > > >> is
> > > >>>>>>> 4000ms.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any
> > > >>> insights
> > > >>>>> on
> > > >>>>>>> what
> > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Li
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>

Re: write performance issue in 3.6.2

Posted by Antoine Pitrou <an...@python.org>.
Can you explain why this is posted to the Arrow mailing-list?  This
does not seem relevant to Arrow.  If indeed it isn't, please remove the
Arrow mailing-list from the recipients.

Regards

Antoine.


On Wed, 21 Apr 2021 11:25:20 +0800
shrikant kalani
<sh...@gmail.com> wrote:
> Hello Everyone,
> 
> We are also using ZooKeeper 3.6.2 with SSL enabled on both sides. We
> observed the same behaviour: under high write load the ZK server starts
> expiring sessions. There are no JVM-related issues. During high load the
> max latency increases significantly.
> 
> Also, the session expiration message is not accurate. We have session
> expiration set to 40 sec, but the ZK server disconnects the client within 10 sec.
> 
> Also, the logs print messages about throttling the request, but the ZK
> documentation says throttling is disabled by default. Can someone check the
> code to confirm whether it is enabled or disabled? I am not a developer and
> hence not familiar with Java code.
> 
> Thanks
> Srikant Kalani
> 
> On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@public.gmane.org> wrote:
> 
> > What is the workload looking like? Is it pure write, or mixed read write?
> >
> > A couple of ideas to move this forward:
> > * Publish the performance benchmark so the community can help.
> > * Bisect git commit and find the bad commit that caused the regression.
> > * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> > metrics) to measure where time spends during writes. We might have to add
> > these metrics on 3.4 to get a fair comparison.
> >
> > For the throttling - the RequestThrottler introduced in 3.6 does introduce
> > latency, but should not impact throughput this much.
> >
> > On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@public.gmane.org> wrote:
> >  
> > > The CPU usage of both server and client are normal (< 50%) during the  
> > test.  
> > >
> > > Based on the investigation, the server is too busy with the load.
> > >
> > > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > > write performance degradation from 3.4.14 to 3.6.2 and how we can address
> > > the issue.
> > >
> > > Best,
> > >
> > > Li
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@public.gmane.org> wrote:
> > >  
> > > > What is the CPU usage of both server and client during the test?
> > > >
> > > > Looks like server is dropping the clients because either the server or
> > > > both are too busy to deal with the load.
> > > > This log line is also concerning: "Too busy to snap, skipping”
> > > >
> > > > If that’s the case I believe you'll have to profile the server process  
> > to  
> > > > figure out where the perf bottleneck is.
> > > >
> > > > Andor
> > > >
> > > >
> > > >
> > > >  
> > > > > On 2021. Feb 22., at 5:31, Li Wang <li...@public.gmane.org> wrote:
> > > > >
> > > > > Thanks, Patrick.
> > > > >
> > > > > Yes, we are using the same JVM version and GC configurations when
> > > > > running the two tests. I have checked the GC metrics and also the  
> > heap  
> > > > dump  
> > > > > of the 3.6, the GC pause and the memory usage look okay.
> > > > >
> > > > > Best,
> > > > >
> > > > > Li
> > > > >
> > > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@public.gmane.org>  
> > wrote:  
> > > > >  
> > > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@public.gmane.org> wrote:
> > > > >>  
> > > > >>> Hi Enrico, Sushant,
> > > > >>>
> > > > >>> I re-run the perf test with the data consistency check feature  
> > > disabled  
> > > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance  
> > issue  
> > > of  
> > > > >> 3.6  
> > > > >>> is still there.
> > > > >>>
> > > > >>> With everything exactly the same, the throughput of 3.6 was only  
> > 1/2  
> > > of  
> > > > >> 3.4  
> > > > >>> and the max latency was more than 8 times.
> > > > >>>
> > > > >>> Any other points or thoughts?
> > > > >>>
> > > > >>>  
> > > > >> In the past I've noticed a big impact of GC when doing certain  
> > > > performance  
> > > > >> measurements. I assume you are using the same JVM version and GC  
> > when  
> > > > >> running the two tests? Perhaps our memory footprint has expanded  
> > over  
> > > > time.  
> > > > >> You should rule out GC by running with gc logging turned on with  
> > both  
> > > > >> versions and compare the impact.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Patrick
> > > > >>
> > > > >>  
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Li
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@public.gmane.org> wrote:
> > > > >>>  
> > > > >>>> Thanks Sushant and Enrico!
> > > > >>>>
> > > > >>>> This is a really good point.  According to the 3.6 documentation,  
> > > the  
> > > > >>>> feature is disabled by default.
> > > > >>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration  
> > > > >>> .  
> > > > >>>> However, checking the code, the default is enabled.
> > > > >>>>
> > > > >>>> Let me set the zookeeper.digest.enabled to false and see how the  
> > > write  
> > > > >>>> operation performs.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>>
> > > > >>>> Li
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <  
> > > sushantmane7@gmail.com>  
> > > > >>>> wrote:
> > > > >>>>  
> > > > >>>>> Hi Li,
> > > > >>>>>
> > > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by  
> > default:  
> > > > >>>>>
> > > > >>>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136  
> > > > >>>>> .
> > > > >>>>> It is not present in ZK 3.4.14.
> > > > >>>>>
> > > > >>>>> This feature does have some impact on write performance.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Sushant
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <  
> > > > eolivelli@gmail.com  
> > > > >>>  
> > > > >>>>> wrote:
> > > > >>>>>  
> > > > >>>>>> Li,
> > > > >>>>>> I wonder of we have some new throttling/back pressure mechanisms  
> > > > >> that  
> > > > >>> is  
> > > > >>>>>> enabled by default.
> > > > >>>>>>
> > > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Enrico
> > > > >>>>>>
> > > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@public.gmane.org> ha  
> > scritto:  
> > > > >>>>>>  
> > > > >>>>>>> Hi,
> > > > >>>>>>>
> > > > >>>>>>> We switched to Netty on both client side and server side and  
> > the  
> > > > >>>>>>> performance issue is still there.  Anyone has any insights on  
> > > what  
> > > > >>>>> could  
> > > > >>>>>> be  
> > > > >>>>>>> the cause of higher latency?
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>>
> > > > >>>>>>> Li
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@public.gmane.org>  
> > > > >> wrote:  
> > > > >>>>>>>  
> > > > >>>>>>>> Hi Enrico,
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks for the reply.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms,  
> > Max  
> > > > >>>>> Latency  
> > > > >>>>>>> 31s  
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4: throughput: 15k, failure: 0,  Avg Latency: 30ms,  Max  
> > > > >>> Latency:  
> > > > >>>>>> 1.6s  
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3. Yes, the JVM and zoo.cfg config are the exact same
> > > > >>>>>>>>
> > > > >>>>>>>> 10G of Heap
> > > > >>>>>>>>
> > > > >>>>>>>> 13G of Memory
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Participante
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Observere
> > > > >>>>>>>>
> > > > >>>>>>>> Client session timeout: 3000ms
> > > > >>>>>>>>
> > > > >>>>>>>> Server min session time: 4000ms
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 4. Yes, there are two types of  WARN logs and many “Expiring  
> > > > >>>>> session”  
> > > > >>>>>>>> INFO log
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN
> > > > >>>>>>>> [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
> > > > >>>>>>>>
> > > > >>>>>>>> EndOfStreamException: Unable to read additional data from  
> > > > >> client,  
> > > > >>> it  
> > > > >>>>>>>> probably closed the socket: address = /100.108.63.116:43366,  
> > > > >>>>> session =  
> > > > >>>>>>>> 0x400189fee9a000b
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>  
> > > > >>>>>>>  
> > > > >>>>>>  
> > > > >>>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)  
> > > > >>>>>>>>
> > > > >>>>>>>> at  
> > > > >>>>>>  
> > > > >>  
> > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)  
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>  
> > > > >>>>>>>  
> > > > >>>>>>  
> > > > >>>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)  
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>  
> > > > >>>>>>>  
> > > > >>>>>>  
> > > > >>>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)  
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>  
> > > > >>>>>>>  
> > > > >>>>>>  
> > > > >>>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)  
> > > > >>>>>>>>
> > > > >>>>>>>> at
> > > > >>>>>>>>  
> > > > >>>>>>>  
> > > > >>>>>>  
> > > > >>>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)  
> > > > >>>>>>>>
> > > > >>>>>>>> at java.base/java.lang.Thread.run(Thread.java:834)
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN
> > > > >>>>>>>> [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap,  
> > > > >>>>> skipping  
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO
> > > > >>>>>>>> [SessionTracker:ZooKeeperServer@610] - Expiring session
> > > > >>>>>>>> 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 5. Yes we upgrade both the client and the server to 3.6.  
> > > > >> Actually,  
> > > > >>>>> the  
> > > > >>>>>>>> issue happened with the combinations of
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> Please let me know if you need any additional info.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>>
> > > > >>>>>>>> Li
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@public.gmane.org>  
> > > > >>> wrote:  
> > > > >>>>>>>>  
> > > > >>>>>>>>> Hi Enrico,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for the reply.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. We are using direct NIO based stack, not Netty based yet.
> > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > >>>>>>>>>
> > > > >>>>>>>>> 3.6:
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <  
> > > > >>>>> eolivelli@gmail.com  
> > > > >>>>>>>  
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>  
> > > > >>>>>>>>>> IIRC The main difference is about the switch to Netty 4 and  
> > > > >>> about  
> > > > >>>>>> using  
> > > > >>>>>>>>>> more DirectMemory. Are you using the Netty based stack?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Apart from that macro difference there have been many many  
> > > > >>> changes  
> > > > >>>>>>> since  
> > > > >>>>>>>>>> 3.4.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you have some metrics to share?
> > > > >>>>>>>>>> Are the  JVM configurations and zoo.cfg configuration equals  
> > > > >> to  
> > > > >>>>> each  
> > > > >>>>>>>>>> other?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Did you upgrade both the client and the server or only the  
> > > > >>> server?  
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Enrico
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Il Lun 15 Feb 2021, 18:30 Li Wang <li...@public.gmane.org> ha  
> > > > >>> scritto:  
> > > > >>>>>>>>>>  
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2.  During the  
> > > > >>>>> perform/load  
> > > > >>>>>>>>>>> comparison test,  it was found that the performance of 3.6  
> > > > >> has  
> > > > >>>>> been  
> > > > >>>>>>>>>>> significantly degraded compared to 3.4 for the write  
> > > > >>> operation.  
> > > > >>>>>> Under  
> > > > >>>>>>>>>> the  
> > > > >>>>>>>>>>> same load, there was a huge number of SessionExpired and  
> > > > >>>>>>> ConnectionLoss  
> > > > >>>>>>>>>>> errors in 3.6 while no such errors in 3.4.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The load testing is 500 concurrent users with a cluster of  
> > 5  
> > > > >>>>>>>>>> participants  
> > > > >>>>>>>>>>> and 5 observers. The min session timeout on the server side  
> > > > >> is  
> > > > >>>>>>> 4000ms.  
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any  
> > > > >>> insights  
> > > > >>>>> on  
> > > > >>>>>>> what  
> > > > >>>>>>>>>>> could be the cause of the performance degradation.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Li
> > > > >>>>>>>>>>>  
> > > > >>>>>>>>>>  
> > > > >>>>>>>>>  
> > > > >>>>>>>  
> > > > >>>>>>  
> > > > >>>>>  
> > > > >>>>  
> > > > >>>  
> > > > >>  
> > > >
> > > >  
> > >  
> >  
> 




Re: write performance issue in 3.6.2

Posted by Antoine Pitrou <an...@python.org>.
Can you explain why this is posted to the Arrow mailing-list?  This
does not seem relevant to Arrow.  If indeed it isn't, please remove the
Arrow mailing-list from the recipients.

Regards

Antoine.


On Wed, 21 Apr 2021 11:25:20 +0800
shrikant kalani
<sh...@gmail.com> wrote:
> Hello Everyone,
> 
> We are also using zookeeper 3.6.2 with ssl turned on both sides. We
> observed the same behaviour where under high write load the ZK server
> starts expiring the session. There are no jvm related issues. During high
> load the max latency increases significantly.
> 
> Also the session expiration message is not accurate. We do have session
> expiration set to 40 sec but ZK server disconnects the client within 10 sec.
> 
> Also the logs prints throttling the request but ZK documentation says
> throttling is disabled by default. Can someone check the code once to see
> if it is enabled or disabled. I am not a developer and hence not familiar
> with java code.
> 
> Thanks
> Srikant Kalani
> 
> On Wed, 21 Apr 2021 at 11:03 AM, Michael Han <ha...@public.gmane.org> wrote:
> 
> > What is the workload looking like? Is it pure write, or mixed read write?
> >
> > A couple of ideas to move this forward:
> > * Publish the performance benchmark so the community can help.
> > * Bisect git commit and find the bad commit that caused the regression.
> > * Use the fine grained metrics introduced in 3.6 (e.g per processor stage
> > metrics) to measure where time spends during writes. We might have to add
> > these metrics on 3.4 to get a fair comparison.
> >
> > For the throttling - the RequestThrottler introduced in 3.6 does introduce
> > latency, but should not impact throughput this much.
> >
> > On Thu, Mar 11, 2021 at 11:46 AM Li Wang <li...@public.gmane.org> wrote:
> >  
> > > The CPU usage of both server and client are normal (< 50%) during the  
> > test.  
> > >
> > > Based on the investigation, the server is too busy with the load.
> > >
> > > The issue doesn't exist in 3.4.14. I wonder why there is a significant
> > > write performance degradation from 3.4.14 to 3.6.2 and how we can address
> > > the issue.
> > >
> > > Best,
> > >
> > > Li
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Mar 11, 2021 at 11:25 AM Andor Molnar <an...@public.gmane.org> wrote:
> > >  
> > > > What is the CPU usage of both server and client during the test?
> > > >
> > > > Looks like server is dropping the clients because either the server or
> > > > both are too busy to deal with the load.
> > > > This log line is also concerning: "Too busy to snap, skipping”
> > > >
> > > > If that’s the case I believe you'll have to profile the server process  
> > to  
> > > > figure out where the perf bottleneck is.
> > > >
> > > > Andor
> > > >
> > > >
> > > >
> > > >  
> > > > > On 2021. Feb 22., at 5:31, Li Wang <li...@public.gmane.org> wrote:
> > > > >
> > > > > Thanks, Patrick.
> > > > >
> > > > > Yes, we are using the same JVM version and GC configurations when
> > > > > running the two tests. I have checked the GC metrics and also the  
> > heap  
> > > > dump  
> > > > > of the 3.6, the GC pause and the memory usage look okay.
> > > > >
> > > > > Best,
> > > > >
> > > > > Li
> > > > >
> > > > > On Sun, Feb 21, 2021 at 3:34 PM Patrick Hunt <ph...@public.gmane.org>  
> > wrote:  
> > > > >  
> > > > >> On Sun, Feb 21, 2021 at 3:28 PM Li Wang <li...@public.gmane.org> wrote:
> > > > >>  
> > > > >>> Hi Enrico, Sushant,
> > > > >>>
> > > > >>> I re-run the perf test with the data consistency check feature  
> > > disabled  
> > > > >>> (i.e. -Dzookeeper.digest.enabled=false), the write performance  
> > issue  
> > > of  
> > > > >> 3.6  
> > > > >>> is still there.
> > > > >>>
> > > > >>> With everything exactly the same, the throughput of 3.6 was only  
> > 1/2  
> > > of  
> > > > >> 3.4  
> > > > >>> and the max latency was more than 8 times.
> > > > >>>
> > > > >>> Any other points or thoughts?
> > > > >>>
> > > > >>>  
> > > > >> In the past I've noticed a big impact of GC when doing certain  
> > > > performance  
> > > > >> measurements. I assume you are using the same JVM version and GC  
> > when  
> > > > >> running the two tests? Perhaps our memory footprint has expanded  
> > over  
> > > > time.  
> > > > >> You should rule out GC by running with gc logging turned on with  
> > both  
> > > > >> versions and compare the impact.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Patrick
> > > > >>
> > > > >>  
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Li
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Sat, Feb 20, 2021 at 9:04 PM Li Wang <li...@public.gmane.org> wrote:
> > > > >>>  
> > > > >>>> Thanks Sushant and Enrico!
> > > > >>>>
> > > > >>>> This is a really good point.  According to the 3.6 documentation,  
> > > the  
> > > > >>>> feature is disabled by default.
> > > > >>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > https://zookeeper.apache.org/doc/r3.6.2/zookeeperAdmin.html#ch_administration  
> > > > >>> .  
> > > > >>>> However, checking the code, the default is enabled.
> > > > >>>>
> > > > >>>> Let me set the zookeeper.digest.enabled to false and see how the  
> > > write  
> > > > >>>> operation performs.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>>
> > > > >>>> Li
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Fri, Feb 19, 2021 at 1:32 PM Sushant Mane <  
> > > sushantmane7@gmail.com>  
> > > > >>>> wrote:
> > > > >>>>  
> > > > >>>>> Hi Li,
> > > > >>>>>
> > > > >>>>> On 3.6.2 consistency checker (adhash based) is enabled by  
> > default:  
> > > > >>>>>
> > > > >>>>>  
> > > > >>>  
> > > > >>  
> > > >  
> > >  
> > https://github.com/apache/zookeeper/blob/803c7f1a12f85978cb049af5e4ef23bd8b688715/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L136  
> > > > >>>>> .
> > > > >>>>> It is not present in ZK 3.4.14.
> > > > >>>>>
> > > > >>>>> This feature does have some impact on write performance.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Sushant
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Feb 19, 2021 at 12:50 PM Enrico Olivelli <  
> > > > eolivelli@gmail.com  
> > > > >>>  
> > > > >>>>> wrote:
> > > > >>>>>  
> > > > >>>>>> Li,
> > > > >>>>>> I wonder of we have some new throttling/back pressure mechanisms  
> > > > >> that  
> > > > >>> is  
> > > > >>>>>> enabled by default.
> > > > >>>>>>
> > > > >>>>>> Does anyone has some pointer to relevant implementations?
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Enrico
> > > > >>>>>>
> > > > >>>>>> Il Ven 19 Feb 2021, 19:46 Li Wang <li...@public.gmane.org> ha  
> > scritto:  
> > > > >>>>>>  
> > > > >>>>>>> Hi,
> > > > >>>>>>>
> > > > >>>>>>> We switched to Netty on both client side and server side and  
> > the  
> > > > >>>>>>> performance issue is still there.  Anyone has any insights on  
> > > what  
> > > > >>>>> could  
> > > > >>>>>> be  
> > > > >>>>>>> the cause of higher latency?
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>>
> > > > >>>>>>> Li
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Mon, Feb 15, 2021 at 2:17 PM Li Wang <li...@public.gmane.org>  
> > > > >> wrote:  
> > > > >>>>>>>  
> > > > >>>>>>>> Hi Enrico,
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks for the reply.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 1. We are using NIO based stack, not Netty based yet.
> > > > >>>>>>>>
> > > > >>>>>>>> 2. Yes, here are some metrics on the client side.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6: throughput: 7K, failure: 81215228, Avg Latency: 57ms, Max Latency: 31s
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4: throughput: 15k, failure: 0, Avg Latency: 30ms, Max Latency: 1.6s
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 3. Yes, the JVM and zoo.cfg configs are exactly the same:
> > > > >>>>>>>>
> > > > >>>>>>>> 10G of Heap
> > > > >>>>>>>>
> > > > >>>>>>>> 13G of Memory
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Participants
> > > > >>>>>>>>
> > > > >>>>>>>> 5 Observers
> > > > >>>>>>>>
> > > > >>>>>>>> Client session timeout: 3000ms
> > > > >>>>>>>>
> > > > >>>>>>>> Server min session timeout: 4000ms
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 4. Yes, there are two types of WARN logs and many “Expiring
> > > > >>>>>>>> session” INFO logs:
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:04:36,506 [myid:4] - WARN [NIOWorkerThread-7:NIOServerCnxn@365] - Unexpected exception
> > > > >>>>>>>> EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /100.108.63.116:43366, session = 0x400189fee9a000b
> > > > >>>>>>>>         at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:164)
> > > > >>>>>>>>         at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:327)
> > > > >>>>>>>>         at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
> > > > >>>>>>>>         at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
> > > > >>>>>>>>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > > >>>>>>>>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > > >>>>>>>>         at java.base/java.lang.Thread.run(Thread.java:834)
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:05:14,428 [myid:4] - WARN [SyncThread:4:SyncRequestProcessor@188] - Too busy to snap, skipping
> > > > >>>>>>>>
> > > > >>>>>>>> 2021-02-15 22:01:51,427 [myid:4] - INFO [SessionTracker:ZooKeeperServer@610] - Expiring session 0x400189fee9a001e, timeout of 4000ms exceeded
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> 5. Yes, we upgraded both the client and the server to 3.6.
> > > > >>>>>>>> Actually, the issue happened with both of these combinations:
> > > > >>>>>>>>
> > > > >>>>>>>> 3.4 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> 3.6 client and 3.6 server
> > > > >>>>>>>>
> > > > >>>>>>>> Please let me know if you need any additional info.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>>
> > > > >>>>>>>> Li
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Feb 15, 2021 at 1:44 PM Li Wang <li...@public.gmane.org> wrote:
> > > > >>>>>>>>  
> > > > >>>>>>>>> Hi Enrico,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for the reply.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. We are using the direct NIO-based stack, not the Netty-based one yet.
> > > > >>>>>>>>> 2. Yes, on the client side, here are the metrics
> > > > >>>>>>>>>
> > > > >>>>>>>>> 3.6:
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Feb 15, 2021 at 10:44 AM Enrico Olivelli <eolivelli@gmail.com> wrote:
> > > > >>>>>>>>>  
> > > > >>>>>>>>>> IIRC the main difference is about the switch to Netty 4 and
> > > > >>>>>>>>>> about using more DirectMemory. Are you using the Netty based
> > > > >>>>>>>>>> stack?
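If direct memory usage is a suspect, it can be bounded explicitly so the two runs are comparable; a sketch using the standard JVM flag (the value is illustrative):

    -XX:MaxDirectMemorySize=1g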
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Apart from that macro difference, there have been many, many
> > > > >>>>>>>>>> changes since 3.4.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you have some metrics to share?
> > > > >>>>>>>>>> Are the JVM configuration and zoo.cfg configuration equal to
> > > > >>>>>>>>>> each other?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Do you see warnings on the server logs?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Did you upgrade both the client and the server, or only the
> > > > >>>>>>>>>> server?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Enrico
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Mon, 15 Feb 2021 at 18:30, Li Wang <li...@public.gmane.org> wrote:
> > > > >>>>>>>>>>  
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We want to upgrade from 3.4.14 to 3.6.2. During the
> > > > >>>>>>>>>>> performance/load comparison test, it was found that the
> > > > >>>>>>>>>>> performance of 3.6 has significantly degraded compared to 3.4
> > > > >>>>>>>>>>> for the write operation. Under the same load, there was a huge
> > > > >>>>>>>>>>> number of SessionExpired and ConnectionLoss errors in 3.6,
> > > > >>>>>>>>>>> while there were no such errors in 3.4.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The load test is 500 concurrent users with a cluster of 5
> > > > >>>>>>>>>>> participants and 5 observers. The min session timeout on the
> > > > >>>>>>>>>>> server side is 4000ms.
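For reference, a 5-participant / 5-observer ensemble is usually laid out like this in zoo.cfg (a sketch; hostnames and ports are placeholders):

    server.1=zk1:2888:3888
    server.2=zk2:2888:3888
    server.3=zk3:2888:3888
    server.4=zk4:2888:3888
    server.5=zk5:2888:3888
    server.6=zk6:2888:3888:observer
    server.7=zk7:2888:3888:observer
    server.8=zk8:2888:3888:observer
    server.9=zk9:2888:3888:observer
    server.10=zk10:2888:3888:observer

    # and on the observer machines only:
    peerType=observer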
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I wonder if anyone has seen the same issue and has any insights
> > > > >>>>>>>>>>> on what could be the cause of the performance degradation.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Li
> > > > >>>>>>>>>>>  
> > > > >>>>>>>>>>  
> > > > >>>>>>>>>  
> > > > >>>>>>>  
> > > > >>>>>>  
> > > > >>>>>  
> > > > >>>>  
> > > > >>>  
> > > > >>  
> > > >
> > > >  
> > >  
> >  
>