Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/11/04 14:36:43 UTC

[GitHub] [pulsar] hadesy opened a new issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

hadesy opened a new issue #8449:
URL: https://github.com/apache/pulsar/issues/8449


   **Describe the bug**
   We created a Debezium source in a Kubernetes cluster. The broker reports the following error:
   
   ```
   14:13:21.677 [DL-io-0] ERROR org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl - Unable to allocate memory
   io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 5234491399, max: 5242880000)
   	at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:742) ~[io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:697) ~[io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:758) ~[io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:734) ~[io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:245) ~[io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.PoolArena.allocate(PoolArena.java:227) ~[io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.PoolArena.allocate(PoolArena.java:147) ~[io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:356) ~[io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187) ~[io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:164) [org.apache.bookkeeper-bookkeeper-common-allocator-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:158) [org.apache.bookkeeper-bookkeeper-common-allocator-4.10.0.jar:4.10.0]
   	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187) [io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178) [io.netty-netty-buffer-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.epoll.AbstractEpollChannel.newDirectBuffer0(AbstractEpollChannel.java:328) [io.netty-netty-transport-native-epoll-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.epoll.AbstractEpollChannel.newDirectBuffer(AbstractEpollChannel.java:314) [io.netty-netty-transport-native-epoll-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.epoll.AbstractEpollChannel.newDirectBuffer(AbstractEpollChannel.java:297) [io.netty-netty-transport-native-epoll-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.epoll.AbstractEpollStreamChannel.filterOutboundMessage(AbstractEpollStreamChannel.java:521) [io.netty-netty-transport-native-epoll-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:873) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1367) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:709) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:792) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:702) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at org.apache.bookkeeper.util.ByteBufList$Encoder.write(ByteBufList.java:328) [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:709) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:792) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:702) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:697) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.handler.codec.MessageToMessageEncoder.writePromiseCombiner(MessageToMessageEncoder.java:137) [io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:119) [io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:709) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:792) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:702) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at org.apache.bookkeeper.proto.BookieProtoEncoding$RequestEncoder.write(BookieProtoEncoding.java:391) [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:709) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:792) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:702) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.ChannelDuplexHandler.write(ChannelDuplexHandler.java:115) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at org.apache.bookkeeper.proto.AuthHandler$ClientSideHandler.write(AuthHandler.java:336) [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:764) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1104) [io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:387) [io.netty-netty-transport-native-epoll-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   ```
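
The numbers in the first line of the trace can be checked by hand: the allocator requested one 16 MiB chunk (16777216 bytes) while only about 8 MiB remained under the 5000 MiB limit. The sketch below reproduces that arithmetic. The class and method names are hypothetical and are not part of Netty or BookKeeper.

```java
public class DirectMemoryHeadroom {
    /** Bytes still available under the direct-memory limit. */
    public static long remaining(long used, long max) {
        return max - used;
    }

    /** True when a request of the given size does not fit in the remaining headroom. */
    public static boolean wouldExceed(long requested, long used, long max) {
        return requested > remaining(used, max);
    }

    public static void main(String[] args) {
        // Values copied from the OutOfDirectMemoryError message above.
        long requested = 16_777_216L;  // 16 MiB
        long used = 5_234_491_399L;
        long max = 5_242_880_000L;     // 5000 MiB

        System.out.println("remaining bytes: " + remaining(used, max));
        System.out.println("request exceeds limit: " + wouldExceed(requested, used, max));
    }
}
```

The check only confirms that the limit was genuinely reached. It says nothing about why the buffers were not released.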
   
   **To Reproduce**
   Steps to reproduce the behavior:
   
   1. Create a cluster
   `helm upgrade --install pulsar pulsar`
   
   2. Create a task
   ```
   PulsarAdmin admin = PulsarAdmin.builder()
           .serviceHttpUrl("http://172.16.82.42")
           .build();

   // Debezium MySQL connector settings
   Map<String, Object> config = new HashMap<>();
   config.put("database.hostname", "127.0.0.1");
   config.put("database.port", "3306");
   config.put("database.user", "test");
   config.put("database.password", "test");
   config.put("database.server.id", "184054");
   config.put("database.server.name", "dbserver1");
   config.put("database.whitelist", "test");
   config.put("database.history", "org.apache.pulsar.io.debezium.PulsarDatabaseHistory");
   config.put("database.history.pulsar.topic", "test");
   config.put("database.history.pulsar.service.url", "pulsar://127.0.0.1:6650");
   config.put("key.converter", "org.apache.kafka.connect.json.JsonConverter");
   config.put("value.converter", "org.apache.kafka.connect.json.JsonConverter");
   config.put("pulsar.service.url", "pulsar://127.0.0.1:6650");
   config.put("offset.storage.topic", "offset");

   // Submit the built-in Debezium MySQL source
   admin.sources().createSourceAsync(SourceConfig.builder()
           .name("cdc-test")
           .namespace("cdc")
           .tenant("test")
           .topicName("binlog_test")
           .parallelism(1)
           .archive("builtin://debezium-mysql")
           .schemaType("json")
           .configs(config)
           .build(), null);
   ```
   3. The broker runs out of direct memory (OOM):
   ![1](https://user-images.githubusercontent.com/6346047/98123275-9a9c9400-1eec-11eb-8a31-5f5fc05dc6f8.jpg)
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] hadesy commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
hadesy commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-722752785


   @codelipenghui 2.6.1





[GitHub] [pulsar] codelipenghui commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-723373527


   @hadesy Your dashboard shows a max direct memory of 5 GB, but your configData sets the max direct memory size to 256 MB, so I'm not sure the config data is applied correctly.
   
   Anyway, can you share your broker log? I want to check whether there are error logs other than the OOM log. They might offer clues as to why the ByteBuffer is not properly released.





[GitHub] [pulsar] hadesy commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
hadesy commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-723399232


   @codelipenghui We only modified PULSAR_MEM:
   `-Xms2048m -Xmx2048m -XX:MaxDirectMemorySize=5000m`
   We also checked the broker log and found no other errors. This is a new cluster, so I think the problem can be easily reproduced.
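
The `5000m` value in this PULSAR_MEM setting converts to exactly the `max: 5242880000` reported in the error message, which suggests that the modified setting, not the chart's `256m` default, was in effect on the broker. The following is a minimal sketch of that conversion, assuming the usual JVM size suffixes (k, m, g). The class and method names are hypothetical.

```java
import java.util.Locale;

public class MaxDirectMemoryFlag {
    private static final String FLAG = "-XX:MaxDirectMemorySize=";

    /** Returns the configured limit in bytes, or -1 if the flag is absent. */
    public static long parseBytes(String jvmOptions) {
        long result = -1L;
        for (String opt : jvmOptions.trim().split("\\s+")) {
            if (opt.startsWith(FLAG)) {
                // Keep the last occurrence when the flag is repeated.
                result = toBytes(opt.substring(FLAG.length()));
            }
        }
        return result;
    }

    /** Converts a JVM size such as "256m" or "5000m" to bytes. */
    static long toBytes(String size) {
        String s = size.toLowerCase(Locale.ROOT);
        char unit = s.charAt(s.length() - 1);
        long multiplier;
        switch (unit) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default:  return Long.parseLong(s);  // plain byte count, no suffix
        }
        return Long.parseLong(s.substring(0, s.length() - 1)) * multiplier;
    }

    public static void main(String[] args) {
        // Modified setting reported in this comment.
        System.out.println(parseBytes("-Xms2048m -Xmx2048m -XX:MaxDirectMemorySize=5000m"));
        // Helm chart default quoted elsewhere in the thread.
        System.out.println(parseBytes("-Xms128m -Xmx256m -XX:MaxDirectMemorySize=256m"));
    }
}
```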





[GitHub] [pulsar] hadesy commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
hadesy commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-725473916


   We suspect it was caused by incorrect load balancing settings. After we configured them properly, it works fine.





[GitHub] [pulsar] codelipenghui commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-722735642


   @hadesy Which Pulsar version are you using? We added the publish buffer limit `maxMessagePublishBufferSizeInMB` to the broker in 2.5.1. If you are using an older version, you can try to add system.d for the broker, or you can upgrade the broker to the latest release.





[GitHub] [pulsar] codelipenghui commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-723401134


   @hadesy OK, please post the whole broker log here when the OOM occurs. This will help troubleshoot the problem.





[GitHub] [pulsar] hadesy commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
hadesy commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-722924233


   @codelipenghui We use the default policy in the Helm chart:
   ```
    configData:
       PULSAR_MEM: >
         -Xms128m -Xmx256m -XX:MaxDirectMemorySize=256m
       PULSAR_GC: >
         -XX:+UseG1GC
         -XX:MaxGCPauseMillis=10
         -Dio.netty.leakDetectionLevel=disabled
         -Dio.netty.recycler.linkCapacity=1024
         -XX:+ParallelRefProcEnabled
         -XX:+UnlockExperimentalVMOptions
         -XX:+DoEscapeAnalysis
         -XX:ParallelGCThreads=4
         -XX:ConcGCThreads=4
         -XX:G1NewSizePercent=50
         -XX:+DisableExplicitGC
         -XX:-ResizePLAB
         -XX:+ExitOnOutOfMemoryError
         -XX:+PerfDisableSharedMem
       managedLedgerDefaultEnsembleSize: "2"
       managedLedgerDefaultWriteQuorum: "2"
       managedLedgerDefaultAckQuorum: "2"
   ```





[GitHub] [pulsar] codelipenghui commented on issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #8449:
URL: https://github.com/apache/pulsar/issues/8449#issuecomment-722755491


   @hadesy OK, could you please provide the persistence policy (EnsembleSize, WriteQuorum, AckQuorum)? Currently, if AckQuorum < WriteQuorum and one bookie is always in a slow-write state, the buffer might not be released quickly.
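
The condition described in this comment can be checked against a given policy. The sketch below assumes, as the comment suggests, that writes still outstanding after the ack quorum is met are what can hold buffers when a bookie is slow. The class and method names are hypothetical and are not BookKeeper or Pulsar APIs.

```java
public class QuorumPolicyCheck {
    /** True when ensemble >= write quorum >= ack quorum >= 1. */
    public static boolean isValid(int ensemble, int writeQuorum, int ackQuorum) {
        return ackQuorum >= 1 && writeQuorum >= ackQuorum && ensemble >= writeQuorum;
    }

    /** Bookie responses still outstanding once the ack quorum has been reached. */
    public static int responsesPendingAfterAck(int writeQuorum, int ackQuorum) {
        return writeQuorum - ackQuorum;
    }

    public static void main(String[] args) {
        // Policy reported elsewhere in this thread: 2 / 2 / 2.
        System.out.println("valid: " + isValid(2, 2, 2));
        System.out.println("pending after ack: " + responsesPendingAfterAck(2, 2));

        // Illustrative policy where the concern would apply: 3 / 3 / 2.
        System.out.println("pending after ack: " + responsesPendingAfterAck(3, 2));
    }
}
```

With the 2/2/2 policy reported in the thread, no responses remain outstanding after the ack quorum, so the slow-bookie scenario described here would not apply to that configuration.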





[GitHub] [pulsar] hadesy closed issue #8449: Pulsar CDC io.netty.util.internal.OutOfDirectMemoryError

Posted by GitBox <gi...@apache.org>.
hadesy closed issue #8449:
URL: https://github.com/apache/pulsar/issues/8449


   

