Posted to dev@pulsar.apache.org by linlin <li...@apache.org> on 2021/07/29 16:37:03 UTC

[Discuss] Optimize the performance of creating Topic

Creating a topic will first check whether the topic already exists.
The verification will read all topics under the namespace, and then
traverse these topics to see if the topic already exists.
When there are a large number of topics under the namespace (about 300,000
topics), fewer than 10 topics can be created per second.

Without a distributed lock, this check is unreliable and costly.
I tried removing this check and writing to ZooKeeper directly: if the znode
already exists, the topic already exists.
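
As an illustration only (not code from the broker), the create-and-detect idea
can be sketched with an in-memory map standing in for ZooKeeper. The class and
method names below are hypothetical. With the real ZooKeeper client, the
equivalent signal would be the NodeExistsException raised by create().

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for the metadata store.
// This is NOT the Pulsar or ZooKeeper API; it only mimics "create fails if the node exists".
final class CreateIfAbsentStore {
    private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    // Atomically creates the node at `path`.
    // Returns true if it was created, false if a node already existed there.
    // No listing of the namespace is needed beforehand.
    boolean createTopic(String path, String metadata) {
        return nodes.putIfAbsent(path, metadata) == null;
    }
}
```

The first createTopic call for a path returns true and any later call for the
same path returns false, so the existence check and the write are one step.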

Then I found this scenario in the unit tests:
The user already has a non-partitioned topic named `topic-name-partition-123`
and then wants to create a partitioned topic named `topic-name`.
Currently this creation is rejected, because prefix matching is also
performed while traversing all topics.
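
A rough sketch of what such a traversal with prefix matching might look like
(an assumption for illustration, not the actual broker code; the class and
method names are hypothetical):

```java
import java.util.List;

// Hypothetical sketch of a full scan with prefix matching.
final class PrefixConflictCheck {
    static final String PARTITION_MARKER = "-partition-";

    // Returns true if an existing topic equals `name`
    // or looks like one of its partitions (`name-partition-N`).
    // Cost is linear in the number of topics in the namespace.
    static boolean conflicts(List<String> existingTopics, String name) {
        String partitionPrefix = name + PARTITION_MARKER;
        for (String existing : existingTopics) {
            if (existing.equals(name) || existing.startsWith(partitionPrefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With `topic-name-partition-123` in the list, conflicts(..., "topic-name")
returns true, which is the case that a plain znode-exists check would miss.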

To solve this problem, I want to add a reserved-word check to the topic
creation interface so that a topic name may not contain `-partition-`,
but this may cause some compatibility problems.
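
A minimal sketch of such a reserved-word check, assuming it runs on the
user-supplied name before anything is written (the validator class is
hypothetical, not an existing Pulsar API):

```java
// Hypothetical validator; the class and method names are illustrative only.
final class TopicNameValidator {
    // Marker reserved for the partitions of a partitioned topic.
    static final String RESERVED_MARKER = "-partition-";

    // Rejects user-supplied topic names that are empty or contain the reserved marker.
    static void validate(String topicName) {
        if (topicName == null || topicName.isEmpty()) {
            throw new IllegalArgumentException("topic name must not be empty");
        }
        if (topicName.contains(RESERVED_MARKER)) {
            throw new IllegalArgumentException(
                "topic name must not contain '" + RESERVED_MARKER + "': " + topicName);
        }
    }
}
```

Under this sketch `topic-name` is accepted and `topic-name-partition-123` is
rejected, which is also where the compatibility concern comes from: existing
topics with such names would no longer be creatable.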

I want to hear your opinions. Is there a better way?

Re: [Discuss] Optimize the performance of creating Topic

Posted by Lin Lin <li...@apache.org>.

On 2021/08/03 11:12:34, Ivan Kelly <iv...@apache.org> wrote: 
> > Creating a topic will first check whether the topic already exists.
> > The verification will read all topics under the namespace, and then
> > traverse these topics to see if the topic already exists.
> > When there are a large number of topics under the namespace(about 300,000
> > topics),
> > less than 10 topics can be created in one second.
> Why do we need to read all topics at all? We really just need to check
> whether TOPIC or TOPIC-partition-0 exist.
> 
> Even if they do not exist, is there anything to stop one client
> creating TOPIC and another creating TOPIC-partition-0?
> 
> -Ivan
> 

Consider the test case "testCreatePartitionedTopicHavingNonPartitionTopicWithPartitionSuffix": some non-partitioned topics carry the partition suffix. In that case we can no longer rely on the cache for the check, and we have to traverse the topics under the namespace.

Re: [Discuss] Optimize the performance of creating Topic

Posted by Ivan Kelly <iv...@apache.org>.
> Creating a topic will first check whether the topic already exists.
> The verification will read all topics under the namespace, and then
> traverse these topics to see if the topic already exists.
> When there are a large number of topics under the namespace(about 300,000
> topics),
> less than 10 topics can be created in one second.
Why do we need to read all topics at all? We really just need to check
whether TOPIC or TOPIC-partition-0 exist.

Even if they do not exist, is there anything to stop one client
creating TOPIC and another creating TOPIC-partition-0?

-Ivan
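
For illustration, the targeted check suggested above could be sketched as two
point lookups instead of a full listing. The Set stands in for whatever
metadata lookup is available; the class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical sketch of a targeted existence check: two lookups, no traversal.
final class TargetedExistenceCheck {
    // Returns true if either TOPIC itself or its first partition exists.
    static boolean exists(Set<String> topics, String name) {
        return topics.contains(name) || topics.contains(name + "-partition-0");
    }
}
```

Two caveats follow from the thread itself: the two lookups are not atomic, so
concurrent clients could still create TOPIC and TOPIC-partition-0, and a
non-partitioned topic such as `topic-name-partition-123` is not detected by
this check.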