You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Vic Xu <vi...@gmail.com> on 2023/05/21 13:30:48 UTC

Kafka 3.2.1 performance issue with JDK 11

Hello all,  I have a Kafka cluster deployed with version 3.2.1 , JDK 11 and log4j 2.18.0. I built my own Kafka image. One of my Kafka brokers is experiencing CPU issues, and based on the jstack information, it seems that log4j is causing the problem due to its usage of StackWalker. How to solve this issue?

Here is jstack information:
"data-plane-kafka-request-handler-6" #59 daemon prio=5 os_prio=0 cpu=86381259.23ms elapsed=1948787.21s tid=0x00007f8939c04800 nid=0x190 runnable  [0x00007f883f6f5000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Native Method)
	at java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Unknown Source)
	at java.lang.StackStreamFactory$AbstractStackWalker.getNextBatch(java.base@11.0.9/Unknown Source)
	at java.lang.StackStreamFactory$AbstractStackWalker.peekFrame(java.base@11.0.9/Unknown Source)
	at java.lang.StackStreamFactory$AbstractStackWalker.hasNext(java.base@11.0.9/Unknown Source)
	at java.lang.StackStreamFactory$StackFrameTraverser.tryAdvance(java.base@11.0.9/Unknown Source)
	at java.util.stream.ReferencePipeline.forEachWithCancel(java.base@11.0.9/Unknown Source)
	at java.util.stream.AbstractPipeline.copyIntoWithCancel(java.base@11.0.9/Unknown Source)
	at java.util.stream.AbstractPipeline.copyInto(java.base@11.0.9/Unknown Source)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(java.base@11.0.9/Unknown Source)
	at java.util.stream.FindOps$FindOp.evaluateSequential(java.base@11.0.9/Unknown Source)
	at java.util.stream.AbstractPipeline.evaluate(java.base@11.0.9/Unknown Source)
	at java.util.stream.ReferencePipeline.findFirst(java.base@11.0.9/Unknown Source)
	at org.apache.logging.log4j.util.StackLocator.lambda$getCallerClass$2(StackLocator.java:57)
	at org.apache.logging.log4j.util.StackLocator$$Lambda$117/0x00000008001a6c40.apply(Unknown Source)
	at java.lang.StackStreamFactory$StackFrameTraverser.consumeFrames(java.base@11.0.9/Unknown Source)
	at java.lang.StackStreamFactory$AbstractStackWalker.doStackWalk(java.base@11.0.9/Unknown Source)
	at java.lang.StackStreamFactory$AbstractStackWalker.callStackWalk(java.base@11.0.9/Native Method)
	at java.lang.StackStreamFactory$AbstractStackWalker.beginStackWalk(java.base@11.0.9/Unknown Source)
	at java.lang.StackStreamFactory$AbstractStackWalker.walk(java.base@11.0.9/Unknown Source)
	at java.lang.StackWalker.walk(java.base@11.0.9/Unknown Source)
	at org.apache.logging.log4j.util.StackLocator.getCallerClass(StackLocator.java:51)
	at org.apache.logging.log4j.util.StackLocatorUtil.getCallerClass(StackLocatorUtil.java:104)
	at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:50)
	at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:47)
	at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:33)
	at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:363)
	at kafka.utils.Logging.logger(Logging.scala:43)
	at kafka.utils.Logging.logger$(Logging.scala:43)
	at kafka.server.SessionlessFetchContext.logger$lzycompute(FetchSession.scala:364)
	- locked <0x00000007fa037e58> (a kafka.server.SessionlessFetchContext)
	at kafka.server.SessionlessFetchContext.logger(FetchSession.scala:364)
	at kafka.utils.Logging.debug(Logging.scala:62)
	at kafka.utils.Logging.debug$(Logging.scala:62)
	at kafka.server.SessionlessFetchContext.debug(FetchSession.scala:364)
	at kafka.server.SessionlessFetchContext.updateAndGenerateResponseData(FetchSession.scala:377)
	at kafka.server.KafkaApis.processResponseCallback$1(KafkaApis.scala:932)
	at kafka.server.KafkaApis.$anonfun$handleFetchRequest$33(KafkaApis.scala:965)
	at kafka.server.KafkaApis.$anonfun$handleFetchRequest$33$adapted(KafkaApis.scala:965)
	at kafka.server.KafkaApis$$Lambda$1241/0x00000008007e4040.apply(Unknown Source)

Re: Kafka 3.2.1 performance issue with JDK 11

Posted by Vic Xu <vi...@gmail.com>.
Hi Greg,

I found another possible solution that is upgrade JDK from 11 to 17. Do you
recommend this solution?

Greg Harris <gr...@aiven.io.invalid> 於 2023年5月22日 週一 上午1:59寫道:

> Vic,
>
> I found an open JIRA issue that previously reported this problem:
> https://issues.apache.org/jira/browse/KAFKA-10877 .
> I believe one workaround is to use log4j 1.x, such as reload4j. Kafka
> still relies on log4j 1.x until the planned upgrade to log4j 2.x in
> kafka 4.0 https://issues.apache.org/jira/browse/KAFKA-9366 .
> I will look into reviving or replacing the performance patch for 3.x.
>
> Hope this helps,
> Greg Harris
>
> On Sun, May 21, 2023 at 6:31 AM Vic Xu <vi...@gmail.com>
> wrote:
> >
> > Hello all,  I have a Kafka cluster deployed with version 3.2.1 , JDK 11
> and log4j 2.18.0. I built my own Kafka image. One of my Kafka brokers is
> experiencing CPU issues, and based on the jstack information, it seems that
> log4j is causing the problem due to its usage of StackWalker. How to solve
> this issue?
> >
> > Here is jstack information:
> > "data-plane-kafka-request-handler-6" #59 daemon prio=5 os_prio=0
> cpu=86381259.23ms elapsed=1948787.21s tid=0x00007f8939c04800 nid=0x190
> runnable  [0x00007f883f6f5000]
> >    java.lang.Thread.State: RUNNABLE
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Native
> Method)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.getNextBatch(java.base@11.0.9/Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.peekFrame(java.base@11.0.9/Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.hasNext(java.base@11.0.9/Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$StackFrameTraverser.tryAdvance(java.base@11.0.9/Unknown
> Source)
> >         at
> java.util.stream.ReferencePipeline.forEachWithCancel(java.base@11.0.9/Unknown
> Source)
> >         at
> java.util.stream.AbstractPipeline.copyIntoWithCancel(java.base@11.0.9/Unknown
> Source)
> >         at java.util.stream.AbstractPipeline.copyInto(java.base@11.0.9/Unknown
> Source)
> >         at
> java.util.stream.AbstractPipeline.wrapAndCopyInto(java.base@11.0.9/Unknown
> Source)
> >         at
> java.util.stream.FindOps$FindOp.evaluateSequential(java.base@11.0.9/Unknown
> Source)
> >         at java.util.stream.AbstractPipeline.evaluate(java.base@11.0.9/Unknown
> Source)
> >         at java.util.stream.ReferencePipeline.findFirst(java.base@11.0.9/Unknown
> Source)
> >         at
> org.apache.logging.log4j.util.StackLocator.lambda$getCallerClass$2(StackLocator.java:57)
> >         at
> org.apache.logging.log4j.util.StackLocator$$Lambda$117/0x00000008001a6c40.apply(Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$StackFrameTraverser.consumeFrames(java.base@11.0.9/Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.doStackWalk(java.base@11.0.9/Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.callStackWalk(java.base@11.0.9/Native
> Method)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.beginStackWalk(java.base@11.0.9/Unknown
> Source)
> >         at
> java.lang.StackStreamFactory$AbstractStackWalker.walk(java.base@11.0.9/Unknown
> Source)
> >         at java.lang.StackWalker.walk(java.base@11.0.9/Unknown Source)
> >         at
> org.apache.logging.log4j.util.StackLocator.getCallerClass(StackLocator.java:51)
> >         at
> org.apache.logging.log4j.util.StackLocatorUtil.getCallerClass(StackLocatorUtil.java:104)
> >         at
> org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:50)
> >         at
> org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:47)
> >         at
> org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:33)
> >         at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:363)
> >         at kafka.utils.Logging.logger(Logging.scala:43)
> >         at kafka.utils.Logging.logger$(Logging.scala:43)
> >         at
> kafka.server.SessionlessFetchContext.logger$lzycompute(FetchSession.scala:364)
> >         - locked <0x00000007fa037e58> (a
> kafka.server.SessionlessFetchContext)
> >         at
> kafka.server.SessionlessFetchContext.logger(FetchSession.scala:364)
> >         at kafka.utils.Logging.debug(Logging.scala:62)
> >         at kafka.utils.Logging.debug$(Logging.scala:62)
> >         at
> kafka.server.SessionlessFetchContext.debug(FetchSession.scala:364)
> >         at
> kafka.server.SessionlessFetchContext.updateAndGenerateResponseData(FetchSession.scala:377)
> >         at
> kafka.server.KafkaApis.processResponseCallback$1(KafkaApis.scala:932)
> >         at
> kafka.server.KafkaApis.$anonfun$handleFetchRequest$33(KafkaApis.scala:965)
> >         at
> kafka.server.KafkaApis.$anonfun$handleFetchRequest$33$adapted(KafkaApis.scala:965)
> >         at
> kafka.server.KafkaApis$$Lambda$1241/0x00000008007e4040.apply(Unknown Source)
>

Re: Kafka 3.2.1 performance issue with JDK 11

Posted by Greg Harris <gr...@aiven.io.INVALID>.
Vic,

I found an open JIRA issue that previously reported this problem:
https://issues.apache.org/jira/browse/KAFKA-10877 .
I believe one workaround is to use log4j 1.x, such as reload4j. Kafka
still relies on log4j 1.x until the planned upgrade to log4j 2.x in
kafka 4.0 https://issues.apache.org/jira/browse/KAFKA-9366 .
I will look into reviving or replacing the performance patch for 3.x.

Hope this helps,
Greg Harris

On Sun, May 21, 2023 at 6:31 AM Vic Xu <vi...@gmail.com> wrote:
>
> Hello all,  I have a Kafka cluster deployed with version 3.2.1 , JDK 11 and log4j 2.18.0. I built my own Kafka image. One of my Kafka brokers is experiencing CPU issues, and based on the jstack information, it seems that log4j is causing the problem due to its usage of StackWalker. How to solve this issue?
>
> Here is jstack information:
> "data-plane-kafka-request-handler-6" #59 daemon prio=5 os_prio=0 cpu=86381259.23ms elapsed=1948787.21s tid=0x00007f8939c04800 nid=0x190 runnable  [0x00007f883f6f5000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Native Method)
>         at java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Unknown Source)
>         at java.lang.StackStreamFactory$AbstractStackWalker.getNextBatch(java.base@11.0.9/Unknown Source)
>         at java.lang.StackStreamFactory$AbstractStackWalker.peekFrame(java.base@11.0.9/Unknown Source)
>         at java.lang.StackStreamFactory$AbstractStackWalker.hasNext(java.base@11.0.9/Unknown Source)
>         at java.lang.StackStreamFactory$StackFrameTraverser.tryAdvance(java.base@11.0.9/Unknown Source)
>         at java.util.stream.ReferencePipeline.forEachWithCancel(java.base@11.0.9/Unknown Source)
>         at java.util.stream.AbstractPipeline.copyIntoWithCancel(java.base@11.0.9/Unknown Source)
>         at java.util.stream.AbstractPipeline.copyInto(java.base@11.0.9/Unknown Source)
>         at java.util.stream.AbstractPipeline.wrapAndCopyInto(java.base@11.0.9/Unknown Source)
>         at java.util.stream.FindOps$FindOp.evaluateSequential(java.base@11.0.9/Unknown Source)
>         at java.util.stream.AbstractPipeline.evaluate(java.base@11.0.9/Unknown Source)
>         at java.util.stream.ReferencePipeline.findFirst(java.base@11.0.9/Unknown Source)
>         at org.apache.logging.log4j.util.StackLocator.lambda$getCallerClass$2(StackLocator.java:57)
>         at org.apache.logging.log4j.util.StackLocator$$Lambda$117/0x00000008001a6c40.apply(Unknown Source)
>         at java.lang.StackStreamFactory$StackFrameTraverser.consumeFrames(java.base@11.0.9/Unknown Source)
>         at java.lang.StackStreamFactory$AbstractStackWalker.doStackWalk(java.base@11.0.9/Unknown Source)
>         at java.lang.StackStreamFactory$AbstractStackWalker.callStackWalk(java.base@11.0.9/Native Method)
>         at java.lang.StackStreamFactory$AbstractStackWalker.beginStackWalk(java.base@11.0.9/Unknown Source)
>         at java.lang.StackStreamFactory$AbstractStackWalker.walk(java.base@11.0.9/Unknown Source)
>         at java.lang.StackWalker.walk(java.base@11.0.9/Unknown Source)
>         at org.apache.logging.log4j.util.StackLocator.getCallerClass(StackLocator.java:51)
>         at org.apache.logging.log4j.util.StackLocatorUtil.getCallerClass(StackLocatorUtil.java:104)
>         at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:50)
>         at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:47)
>         at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:33)
>         at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:363)
>         at kafka.utils.Logging.logger(Logging.scala:43)
>         at kafka.utils.Logging.logger$(Logging.scala:43)
>         at kafka.server.SessionlessFetchContext.logger$lzycompute(FetchSession.scala:364)
>         - locked <0x00000007fa037e58> (a kafka.server.SessionlessFetchContext)
>         at kafka.server.SessionlessFetchContext.logger(FetchSession.scala:364)
>         at kafka.utils.Logging.debug(Logging.scala:62)
>         at kafka.utils.Logging.debug$(Logging.scala:62)
>         at kafka.server.SessionlessFetchContext.debug(FetchSession.scala:364)
>         at kafka.server.SessionlessFetchContext.updateAndGenerateResponseData(FetchSession.scala:377)
>         at kafka.server.KafkaApis.processResponseCallback$1(KafkaApis.scala:932)
>         at kafka.server.KafkaApis.$anonfun$handleFetchRequest$33(KafkaApis.scala:965)
>         at kafka.server.KafkaApis.$anonfun$handleFetchRequest$33$adapted(KafkaApis.scala:965)
>         at kafka.server.KafkaApis$$Lambda$1241/0x00000008007e4040.apply(Unknown Source)