Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2018/11/06 09:11:02 UTC
Slack digest for #general - 2018-11-06

2018-11-05 12:28:46 UTC - pulsar-zhongjj: @pulsar-zhongjj has joined the channel
----
2018-11-05 14:05:26 UTC - Rodrigo Malacarne: Hello all, can anybody please help me to understand what I'm doing wrong?

I would like to test the max number of messages per second my Pulsar standalone (docker) can handle. For that purpose, I've created a very simple C++ producer with a ```while``` loop that constantly sends the same message of approx. 30KB. After one minute, I had sent approx. 85K messages, which I consider to be an extremely low value.

Looking at the network history, it's possible to see that the producer somehow hit a limit at around 240 KiB/s while sending to Pulsar (image attached).

For this test I've used two virtual machines, in the same internal network (100Mbit/s), running in VMWare Workstation. Both machines have a good amount of memory (16GB producer/12GB Pulsar) and 4 cores each.
----
2018-11-05 14:07:18 UTC - Rodrigo Malacarne: BTW, all messages are non-persistent.... I'm sure I'm doing something wrong, so I would be glad to hear how differently you would perform such a test.
----
2018-11-05 14:30:12 UTC - Ivan Kelly: the message is 30KB in size?
----
2018-11-05 14:30:27 UTC - Ivan Kelly: what does the C++ code look like?
----
2018-11-05 14:30:53 UTC - Ivan Kelly: and is that 85K messages / second?
----
2018-11-05 14:48:31 UTC - Rodrigo Malacarne: @Ivan Kelly , per minute ...
----
2018-11-05 14:50:23 UTC - Ivan Kelly: you using synchronous or asynchronous send?
----
2018-11-05 14:56:44 UTC - Rodrigo Malacarne: @Ivan Kelly Sync send 
----
2018-11-05 14:58:00 UTC - Ivan Kelly: it's something like ```
while (true) {
   producer.send("blah");
}
```
----
2018-11-05 14:58:18 UTC - Ivan Kelly: ?
----
2018-11-05 14:59:07 UTC - Ivan Kelly: if using sync send, you're waiting for the previous request to complete before starting the next. at 85k/minute, that's 1416 per second, so each request is taking around 700 micros
----
2018-11-05 14:59:31 UTC - Ivan Kelly: which is reasonable enough given that this is all on a single machine
----
2018-11-05 14:59:37 UTC - Ivan Kelly: if you want higher throughput, use async
----
2018-11-05 15:34:19 UTC - Rodrigo Malacarne: @Ivan Kelly Thank you very much. I’ll try async and verify the performance. In this case the message order is still guaranteed?
----
2018-11-05 15:35:00 UTC - Ivan Kelly: guarantees are much looser with non-persistent
----
2018-11-05 15:35:19 UTC - Ivan Kelly: for the most part, order should be preserved
----
2018-11-05 15:35:34 UTC - Ivan Kelly: but in failure cases, you'll have losses and possibly reorders
----
2018-11-05 16:10:43 UTC - Matteo Merli: In case of failure the client resends all the pending messages in the same order, so the final order is preserved. You will have dups (if deduping is not enabled) but the messages will still be in order. 
----
2018-11-05 16:15:08 UTC - Ivan Kelly: @Matteo Merli is there deduping for non-persistent?
----
2018-11-05 16:23:14 UTC - Rodrigo Malacarne: @Matteo Merli and @Ivan Kelly, thank you very much. I've just run the test with sendAsync... This time I got around 570k messages in one minute.
----
2018-11-05 16:24:59 UTC - Sijie Guo: Do you enable batching at client side?
----
2018-11-05 16:25:35 UTC - Rodrigo Malacarne: @Sijie Guo, where can I check it? I don't think so ...
----
2018-11-05 16:26:41 UTC - Sijie Guo: ProducerConfiguration has a setting for you to enable batching 
----
2018-11-05 16:30:03 UTC - Rodrigo Malacarne: @Sijie Guo This is the code I'm running...
----
2018-11-05 16:31:03 UTC - Rodrigo Malacarne: @Sijie Guo I'm not sure if batching is enabled by default ...
----
2018-11-05 16:31:11 UTC - Rodrigo Malacarne: If it is, so yes, I'm using it ...
----
2018-11-05 16:31:25 UTC - Rodrigo Malacarne: Do you have this information?
----
2018-11-05 16:33:55 UTC - Ivan Kelly: it's default in java, but I don't know in C++
----
2018-11-05 16:34:16 UTC - Sijie Guo: <https://github.com/apache/pulsar/blob/master/pulsar-client-cpp/include/pulsar/ProducerConfiguration.h>
----
2018-11-05 16:34:35 UTC - Sijie Guo: You need to setBatchingEnabled 
----
2018-11-05 16:34:41 UTC - Sergey Zhemzhitsky: @Sergey Zhemzhitsky has joined the channel
----
2018-11-05 16:35:01 UTC - Ivan Kelly: Ya, default in c++ is false
----
2018-11-05 16:35:28 UTC - Sijie Guo: Also I think you should avoid logging every message, otherwise your program is bottlenecked at logging rather than pulsar 
----
2018-11-05 16:35:57 UTC - Rodrigo Malacarne: @Sijie Guo I'm only logging for testing purposes ...
----
2018-11-05 16:45:40 UTC - Grant Wu: Right, but logging has nontrivial overhead
----
2018-11-06 03:00:38 UTC - Joway Wang: @Joway Wang has joined the channel
----