You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/05/14 10:47:20 UTC
[GitHub] [pulsar] baynes opened a new issue #6961: Borderline too big messages hangs topic
baynes opened a new issue #6961:
URL: https://github.com/apache/pulsar/issues/6961
**Describe the bug**
If I send a message <= 5242781 bytes then it is handled OK.
If I send a message >= 5242881 bytes then it is rejected as MessageTooBig.
If I send a message between 5242782 and 5242880 bytes then the Pulsar broker starts continually using 100% of a CPU, the message does not appear on the topic and further attempts to send messages to that topic time out.
**To Reproduce**
My test code looks like this:
```
import pulsar
client = pulsar.Client("pulsar://localhost:6650")
producer = client.create_producer('persistent://public/default/large')
#for i in range(5243000,5242770,-1):
for i in range(5242770,5243000):
print( "%d" % i )
try:
producer.send( b"." * i)
#def callback(res, msg):
#print('Message {} published: {}'.format( msg, str(res) ) )
#producer.send_async( b"." * i, callback )
except Exception as e:
print( "Failed {}".format(e))
client.close()
```
Run the above code and when it reaches the magic value the broker will keep using CPU and following sends wait and then timeout.
You can swap the two loop statements to approach the problem from the other size.
I also experimented with send_async - this shows the same behavior though not so obvious and also seems to cause the broker to use unlimited memory if you keep tying to send.
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
Output from above code:
```
$ python3 tools/producer.py
2020-05-14 11:41:22.077 INFO ConnectionPool:85 | Created connection for pulsar://localhost:6650
2020-05-14 11:41:22.079 INFO ClientConnection:330 | [[::1]:34916 -> [::1]:6650] Connected to broker
2020-05-14 11:41:22.099 INFO HandlerBase:53 | [persistent://public/default/large, ] Getting connection from pool
2020-05-14 11:41:22.194 INFO ProducerImpl:151 | [persistent://public/default/large, ] Created producer on broker [[::1]:34916 -> [::1]:6650]
5242770
5242771
5242772
5242773
5242774
5242775
5242776
5242777
5242778
5242779
5242780
5242781
5242782
Failed Pulsar error: TimeOut
5242783
```
I killed the process at that point.
**Desktop (please complete the following information):**
- OS: Centos 7
**Additional context**
Using Python pulsar-client 2.5.1 installed with pip3 with Python 3.6.8 and Pulsar 2.5.0 broker.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [pulsar] jiazhai commented on issue #6961: Borderline too big messages hangs topic
Posted by GitBox <gi...@apache.org>.
jiazhai commented on issue #6961:
URL: https://github.com/apache/pulsar/issues/6961#issuecomment-629900774
Hi @baynes What is the broker side log shows?
In broker.conf, there is an item:
```
# Max size of messages.
maxMessageSize=5242880
```
Usually you need to config it to support large size.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [pulsar] baynes commented on issue #6961: Borderline too big messages hangs topic
Posted by GitBox <gi...@apache.org>.
baynes commented on issue #6961:
URL: https://github.com/apache/pulsar/issues/6961#issuecomment-638052311
>
>
> Hi @baynes What is the broker side log shows?
There is an awful lot in the log. This bit keeps repeating and looks relevant. [This is with pulsar 2.5.0]
```
09:36:34.692 [bookie-io-1-1] INFO org.apache.bookkeeper.proto.BookieRequestHandler - Channel connected [id: 0x55d7b04e, L:/127.0.0.1:3181 - R:/127.0.0.1:52372]
09:36:34.692 [bookie-io-1-1] ERROR org.apache.bookkeeper.proto.BookieRequestHandler - Unhandled exception occurred in I/O thread or handler
io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds 5242880: 5242881 - discarded
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:513) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:491) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.exceededFrameLength(LengthFieldBasedFrameDecoder.java:378) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:421) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:334) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
09:36:34.693 [bookie-io-1-1] INFO org.apache.bookkeeper.proto.BookieRequestHandler - Channels disconnected: [id: 0x55d7b04e, L:/127.0.0.1:3181 ! R:/127.0.0.1:52372]
09:36:34.693 [bookkeeper-io-41-7] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Disconnected from bookie channel [id: 0x7da0db02, L:/127.0.0.1:52372 ! R:/127.0.0.1:3181]
09:36:34.693 [Thread-1] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@4607cbe2{/metrics,null,UNAVAILABLE}
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.PendingAddOp - Failed to write entry (696, 0): Bookie handle is not available
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:127.0.0.1:3181>], allBookies [<Bookie:127.0.0.1:3181>].
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [<Bookie:127.0.0.1:3181>], fallback to choose bookie randomly from the cluster.
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:127.0.0.1:3181>], allBookies [<Bookie:127.0.0.1:3181>].
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:127.0.0.1:3181>], allBookies [<Bookie:127.0.0.1:3181>].
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [<Bookie:127.0.0.1:3181>], fallback to choose bookie randomly from the cluster.
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:127.0.0.1:3181>], allBookies [<Bookie:127.0.0.1:3181>].
09:36:34.695 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.MetadataUpdateLoop - UpdateLoop(ledgerId=696,loopId=70cd2f85) Exception updating
org.apache.bookkeeper.client.BKException$BKNotEnoughBookiesException: Not enough non-faulty bookies available
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandomInternal(RackawareEnsemblePlacementPolicyImpl.java:772) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandom(RackawareEnsemblePlacementPolicyImpl.java:690) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:579) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:203) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:539) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:224) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:467) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:117) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:306) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.LedgerHandle.lambda$ensembleChangeLoop$2(LedgerHandle.java:1920) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.MetadataUpdateLoop.writeLoop(MetadataUpdateLoop.java:122) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.MetadataUpdateLoop.run(MetadataUpdateLoop.java:111) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.LedgerHandle.ensembleChangeLoop(LedgerHandle.java:1939) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.LedgerHandle.handleBookieFailure(LedgerHandle.java:1888) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:377) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.proto.PerChannelBookieClient$AddCompletion.writeComplete(PerChannelBookieClient.java:2116) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.proto.PerChannelBookieClient$AddCompletion.lambda$errorOut$0(PerChannelBookieClient.java:2139) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.proto.PerChannelBookieClient$CompletionValue$1.safeRun(PerChannelBookieClient.java:1638) [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
09:36:34.696 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.LedgerHandle - [EnsembleChange(ledger:696, change-id:0000000001)][attempt:1] Exception changing ensemble
org.apache.bookkeeper.client.BKException$BKNotEnoughBookiesException: Not enough non-faulty bookies available
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandomInternal(RackawareEnsemblePlacementPolicyImpl.java:772) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandom(RackawareEnsemblePlacementPolicyImpl.java:690) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:579) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:203) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:539) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:224) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:467) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:117) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:306) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.LedgerHandle.lambda$ensembleChangeLoop$2(LedgerHandle.java:1920) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.MetadataUpdateLoop.writeLoop(MetadataUpdateLoop.java:122) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.MetadataUpdateLoop.run(MetadataUpdateLoop.java:111) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.LedgerHandle.ensembleChangeLoop(LedgerHandle.java:1939) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.LedgerHandle.handleBookieFailure(LedgerHandle.java:1888) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:377) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.proto.PerChannelBookieClient$AddCompletion.writeComplete(PerChannelBookieClient.java:2116) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.proto.PerChannelBookieClient$AddCompletion.lambda$errorOut$0(PerChannelBookieClient.java:2139) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.proto.PerChannelBookieClient$CompletionValue$1.safeRun(PerChannelBookieClient.java:1638) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
09:36:34.696 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.LedgerHandle - Closing ledger 696 due to NotEnoughBookiesException: Not enough non-faulty bookies available
09:36:34.696 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L696 E0
0
```
> In broker.conf, there is an item:
>
> ```
> # Max size of messages.
> maxMessageSize=5242880
> ```
>
> Usually you need to config it to support large size.
The default limit is quite large enough. for our needs. We are happy for exceptional messages to be rejected. We just don't want something exceptional taking the system out completely.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org