Posted to dev@pulsar.apache.org by Tyler Landle <ty...@gmail.com> on 2019/02/20 00:34:28 UTC

Benchmarking Pulsar using OpenMessaging Benchmark

Hey Guys,

I have been trying to benchmark Pulsar and gather end-to-end (E2E)
latency data from it, stressing it specifically with respect to E2E
latency. Namely, we have been trying to stress it in a way that would
show E2E latency increasing gradually, say from 5 ms to 50 ms.

We have encountered some strange behavior doing this, and were wondering
if you had any insights on how to generate the information we are trying
to gather.

Weird behavior we are seeing:

1. The clients actually run out of resources before the broker does
for a non-persistent workload.

Using the OpenMessaging benchmark, we run workloads varying from 3 to 243
non-persistent topics and up to 600,000 msg/s against a single broker.
With fewer than 6 local workers acting as producers and consumers, we
find that the clients usually run out of resources first, skewing our
E2E latency statistics. Is this in line with what you see? Is it normal
for clients to exhaust their resources before the broker starts showing
any latency increase?
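
For concreteness, one step of our sweep looks roughly like the workload
below. Values are illustrative and the field names are what we believe
to be the stock OMB workload fields, so please correct us if we are
misreading the format:

    name: 50-topics-200k-non-persistent
    topics: 50
    partitionsPerTopic: 1
    messageSize: 1024
    payloadFile: "payload/payload-1Kb.data"
    subscriptionsPerTopic: 1
    consumerPerSubscription: 1
    producersPerTopic: 1
    # Our understanding is that producerRate is the aggregate publish
    # rate across all producers, so this targets 200,000 msg/s total
    # spread over the 50 topics.
    producerRate: 200000
    consumerBacklogSizeGB: 0
    testDurationMinutes: 15

The topic count and producerRate are the two knobs we sweep.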

2. If you raise the rate on a single topic, it shows higher end-to-end
latency. But if you instead add another topic at a lower rate, that
topic's end-to-end latency does not increase until resource utilization
becomes constrained.

We ran the following experiment (1 broker, non-persistent topics):

2 local workers run a workload on 50 topics with a 200,000 msg/s
aggregate message rate. 2 other local workers run a workload on 1 topic
at 10,000 msg/s, against the same broker, all on non-persistent topics.
The single topic at 10,000 msg/s sees latency around 2 ms, while the
"background" workload of 50 topics at 200,000 msg/s sees latency in the
20 ms range. Do you have any idea why we would see this behavior?
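
In case it helps reproduce this, we drive the two workloads from
disjoint worker pairs with two separate benchmark invocations, roughly
as follows (hostnames and ports are placeholders, and the workload file
names refer to the sketch above plus an analogous single-topic file):

    # Background load: 50 topics, 200,000 msg/s aggregate
    bin/benchmark \
      --drivers driver-pulsar/pulsar.yaml \
      --workers http://worker-1:8080,http://worker-2:8080 \
      workloads/50-topics-200k-non-persistent.yaml

    # Foreground probe: 1 topic, 10,000 msg/s
    bin/benchmark \
      --drivers driver-pulsar/pulsar.yaml \
      --workers http://worker-3:8080,http://worker-4:8080 \
      workloads/1-topic-10k-non-persistent.yaml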



Overall, we are looking to stress the brokers without stressing the
clients first, to see how topic count and message rate affect Pulsar as
a whole. Rather than seeing any linear or otherwise explainable increase
in latency, we have been seeing a fairly flat latency curve (2-5 ms)
followed by a huge spike (to somewhere around 100 ms) at some workload
level. Do you know of some way to produce a gradual, normalized latency
increase that is not due to resource utilization (our first guess as to
why we see such sharp spikes in latency)?
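
For reference, the Pulsar driver settings we touch are along these
lines. This is a sketch from memory of driver-pulsar/pulsar.yaml, so
treat the exact field names as assumptions: topicType is where we select
non-persistent topics, and the producer batching settings are what we
lean on to keep per-message client CPU down:

    name: Pulsar
    driverClass: io.openmessaging.benchmark.driver.pulsar.PulsarBenchmarkDriver

    client:
      serviceUrl: pulsar://broker-1:6650    # placeholder host
      httpUrl: http://broker-1:8080         # placeholder host
      ioThreads: 8
      connectionsPerBroker: 8
      namespacePrefix: benchmark/ns
      topicType: non-persistent             # vs. the default "persistent"

    producer:
      batchingEnabled: true
      batchingMaxPublishDelayMs: 1
      blockIfQueueFull: true
      pendingQueueSize: 1000

    consumer:
      receiverQueueSize: 1000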

Would this be better suited to the users list? I wasn't really sure
which one to send to, but the devs may have a better idea about some of
the behavior we are seeing.

Thanks,
Tyler Landle

Re: Benchmarking Pulsar using OpenMessaging Benchmark

Posted by Dave Fisher <wa...@apache.org>.
Since I moderated this email onto the list I am cc’ing the OP to ensure that any replies are seen.

Regards,
Dave

> On Feb 19, 2019, at 4:34 PM, Tyler Landle <ty...@gmail.com> wrote:
> 
> [full quote of the original message trimmed; see above]