You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Xiangying Meng <xi...@apache.org> on 2023/11/30 05:26:41 UTC

[DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Hi, Pulsar Community

I drafted a proposal to make the configuration of clusters at the namespace
level clearer. This helps solve the problem of geo-replication not working
correctly at the topic level.

https://github.com/apache/pulsar/pull/21648

I'm looking forward to hearing from you.

BR
Xiangying

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Rajan Dhabalia <rd...@apache.org>.
Hi,

I don't agree with your assumptions about allowed clusters and
replication-cluster definition. Allowed clusters defined at tenant level
will restrict tenants to create namespaces in regions/clusters where they
are not allowed. and replication clusters will help tenant to set up full
mesh replication for that namespace across the clusters defined in
replication-clusters. So, each configuration has a different meaning to
serve. If anyone wants to setup different replication clusters then either
those topics can be created under different namespaces or defined at topic
level policy. Pulsar is being used by many legacy systems and changing its
semantics for specific usecases without considering consequences are
creating a lot of pain and incompatibility problems for other existing
systems and let's avoid doing it as we are struggling with such changes and
breaking compatibility or changing semantics are just not acceptable.

Thanks,
Rajan

On Wed, Nov 29, 2023 at 9:26 PM Xiangying Meng <xi...@apache.org> wrote:

> Hi, Pulsar Community
>
> I drafted a proposal to make the configuration of clusters at the namespace
> level clearer. This helps solve the problem of geo-replication not working
> correctly at the topic level.
>
> https://github.com/apache/pulsar/pull/21648
>
> I'm looking forward to hearing from you.
>
> BR
> Xiangying
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Yubiao Feng <yu...@streamnative.io.INVALID>.
Hi Xiangying

In my test, the topic-level replication works as expected when using a
different configuration metadata store but can not work as expected when
using the same configuration metadata store.

I have two questions:
- Do any exact users use the global config metadata store and still
need topic-level replication?
- There is a prop `allowed_clusters` on the tenant level and a prop
`replicated_clusters` on the namespace level. Is it possible that
`replicated_clusters` is designed for Replication but incorrectly used to
do a permission check? If yes, can we just remove this permission check?

Thanks
Yubiao Feng

On Thu, Nov 30, 2023 at 1:26 PM Xiangying Meng <xi...@apache.org> wrote:

> Hi, Pulsar Community
>
> I drafted a proposal to make the configuration of clusters at the namespace
> level clearer. This helps solve the problem of geo-replication not working
> correctly at the topic level.
>
> https://github.com/apache/pulsar/pull/21648
>
> I'm looking forward to hearing from you.
>
> BR
> Xiangying
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Penghui

>I'm sorry, I don't fully understand your point here. What is the "support
replication on message and topic level"?

>As I understand, are the `allowed-clusters` and `replication-clusters` more
concise options?

Pulsar support set replication-cluster for per message.
After this proposal, the `replication-clusters` at the topic/namespace
level could be the default value when the message level is not set, and
`allowed-clusters` could be the constraint when setting
`replication-cluster` for the message level.

I am struggling with whether we should introduce message level into the
title of this proposal.
Do you have any suggestions for the title of the proposal?

BR,
Xiangying

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by PengHui Li <pe...@apache.org>.
Hi Jiwei,

> I think  we can rename this PIP to:   *Introduce `allowed-clusters` and
`topic-policy-synchronized-clusters` to fully support replication on
message and topic level*

I'm sorry, I don't fully understand your point here. What is the "support
replication on message and topic level"?

As I understand, are the `allowed-clusters` and `replication-clusters` more
concise options?
Regards,
Penghui

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by PengHui Li <pe...@apache.org>.
As I understand, I think the topic is about the granularity and
flexibility of the resource hierarchy design.

- The namespace level granularity is good for most cases from the start.
But the business is changing, and the requirement for flexibility will be
engaged.

- Moving to another namespace for the flexibility requirement doesn't make
sense. One application doesn't mean all the topics that the application
used will follow the same policy. The different data the application
produces will have different levels of fault tolerance and security
requirements. If you move some topic topics to another couple of
namespaces, it will totally mess up the namespace management (caused by the
flexibility requirement). And the migration is also a challenge for users.

- The topic-level policy will provide the flexibility of the resource
hierarchy design, not only the geo-replication. As I know, many users are
using the topic level data retention, backlog quota, rate limit, and so on.
The namespace-level policy provides the default value, and some topics
under the namespace use the topic policy to get flexibility.

For a long-term view of the granularity and flexibility of the resource
hierarchy design. It can provide two major advantages.

- The default value of the sub-level
- The constraint of the sub-level

For example:

The default data retention of the namespace is 1GB, and the allowed maximum
data retention is 16 GB.

I think this proposal provides a good vision to resolve the problem.

Regards,
Penghui


On Fri, Dec 15, 2023 at 9:41 AM guo jiwei <te...@apache.org> wrote:

> Hi Xiangying,
>    I think  we can rename this PIP to:   *Introduce `allowed-clusters` and
> `topic-policy-synchronized-clusters` to fully support replication on
> message and topic level*
>    Currently, we can set replication clusters on the message and topic
> level, but the replication clusters should be a subset of the namespace
> replication clusters. which means :
>    If we set namespace replication clusters: cluster1, cluster2, cluster3,
> at most, these three or two clusters can be set on message or topic. If the
> user wanna set cluster4 or others, the replication
>    can't work as expected.
>    It's easy to reproduce by this test:
>
> >    @Test
>
>     public void testEnableReplicationInTopicLevel() throws Exception {
>
>         // 1. Create namespace and topic
>
>         String namespace =
> > BrokerTestUtil.newUniqueName("pulsar/testEnableReplicationInTopicLevel");
>
>         String topic1 = "persistent://" + namespace + "/topic-1";
>
>         admin1.namespaces().createNamespace(namespace);
>
>         admin1.topics().createNonPartitionedTopic(topic1);
>
>         // 2. Configure replication clusters for the topic.
>
>         admin1.topics().setReplicationClusters(topic1, List.of("r1",
> "r2"));
>
>         // 3. Check if the replicator connected successfully.
>
>         Awaitility.await().atMost(5, TimeUnit.MINUTES).untilAsserted(() ->
> {
>
>             List<String> keys = pulsar1.getBrokerService()
>
>                     .getTopic(topic1, false).get().get()
>
>                     .getReplicators().keys();
>
>             assertEquals(keys.size(), 1);
>
>             assertTrue(pulsar1.getBrokerService()
>
>                     .getTopic(topic1, false).get().get()
>
>                     .getReplicators().get(keys.get(0)).isConnected());
>
>         });
>
>     }
>
>
>   To fully support the replication, we find out an easy way to solve it.
> Introduce `allowed-clusters` on namespace policies, which Xiangying
> explains above.
>   How could this work and solve the issue? The same example :
>   If we set namespace replication clusters: cluster1, cluster2, cluster3,
> and
>            set topic1 replication clusters: cluster2, cluster4.
>            set topic2 replication clusters: cluster1, cluster4.
>   We must set `allowed-clusters` with cluster1, cluster2, cluster3, and
> cluster4.  The broker side will validate the topic or message replication
> clusters from the `allowed-cluster.`
>   In this way,  we can simplify more codes and logic here.
>   For *`topic-policy-synchronized-clusters` *we also add examples in the
> PIP.
>
>   Hope the explanation could help @Rajan @Girish
>
>
>
>
> Regards
> Jiwei Guo (Tboy)
>
>
> On Thu, Dec 7, 2023 at 10:29 PM Xiangying Meng <xi...@apache.org>
> wrote:
>
> > Hi Girish,
> >
> > I'm very pleased that we have reached some consensus now. Pulsar already
> > supports geo-replication at the topic level, but the existing
> > implementation of this topic level replication does not match our
> > expectations.
> >
> > At the moment, I can think of three directions to solve this problem:
> >
> > 1. Treat this issue as a bug and fix it so that Pulsar can truly support
> > replication at the topic level.
> > 2. Limit the replication topic policy, so that the replication clusters
> at
> > the topic level must be included in the replication clusters configured
> at
> > the namespace level. In this case, the topic level replication would
> serve
> > as a supplement to the namespace replication, rather than a true topic
> > level policy.
> > 3. Remove topic level replication.
> >
> > I lean towards the first option, as it would make Pulsar's replication
> > configuration more flexible and would not break the previous behavior
> > logic.
> >
> > >Yes, that's my viewpoint. In case that's not your view point, then in
> your
> > >use cases do you ever have more than one namespace inside a tenant?
> > >With every property coming at topic level, it makes no sense for the
> > >namespace hierarchy to exist anymore.
> >
> > I didn't propose this from the perspective of a user, but from the
> > perspective of a Pulsar maintainer. The replication cluster at the topic
> > level cannot function independently like other topic policies, and I
> > attempted to fix it after finding the reason.
> >
> > From the user's perspective, I could modify my system to put topics with
> > the same replication strategy under the same namespace. From the
> > maintainer's perspective, if a feature can help users use Pulsar more
> > flexibly and conveniently without introducing risks, then this feature
> > should be implemented. Perhaps business systems do not want to maintain
> too
> > many namespaces, as they would need to configure multiple namespace
> > policies or it might make their business logic complex. The other
> > configurations for topics under this namespace might be consistent, with
> > only a few topics needing to enable replication. In this case, topic
> level
> > replication becomes valuable. Therefore, I lean towards the first option,
> > to solve this problem and make it a truly expected topic policy.
> >
> > On Thu, Dec 7, 2023 at 12:45 PM Girish Sharma <sc...@gmail.com>
> > wrote:
> >
> > > Hello Xiangying,
> > >
> > >
> > > On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng <xi...@apache.org>
> > > wrote:
> > >
> > > > Hi Girish,
> > > >
> > > > What you are actually opposing is the implementation of true
> > topic-level
> > > > geo-replication. You believe that topics should be divided into
> > different
> > > > namespaces based on replication. Following this line of thought, what
> > we
> > > > should do is restrict the current topic-level replication settings,
> not
> > > > allowing the replication clusters set at the topic level to exceed
> the
> > > > range of replication clusters set in the namespace.
> > > >
> > >
> > > Yes, that's my viewpoint. In case that's not your view point, then in
> > your
> > > use cases do you ever have more than one namespace inside a tenant?
> > > With every property coming at topic level, it makes no sense for the
> > > namespace hierarchy to exist anymore.
> > >
> > >
> > > >
> > > > One point that confuses me is that we provide a setting for
> topic-level
> > > > replication clusters, but it can only be used to amend the namespace
> > > > settings and cannot work independently. Isn't this also a poor design
> > for
> > > > Pulsar?
> > > >
> > >
> > > This feature was originally added in pulsar without a PIP. And the PR
> [0]
> > > also doesn't have much context around why it was needed and why it is
> > being
> > > added.. So I can't comment on why this was added..
> > > But my understanding is that even in a situation when the topics are
> > > divided into proper namespaces based on use cases and suddenly there is
> > an
> > > exceptional need for one of the existing topics to have lesser
> > replication,
> > > then instead of following a long exercise of moving that topic to a new
> > > namespace, you can use this feature.
> > >
> > > [0] - https://github.com/apache/pulsar/pull/12136
> > >
> > >
> > >
> > > >
> > > > On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma <
> scrapmachines@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello, replies inline.
> > > > >
> > > > > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <
> xiangying@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi Girish,
> > > > > >
> > > > > > Thank you for your explanation. Because Joe's email referenced
> the
> > > > > current
> > > > > > implementation of Pulsar, I misunderstood him to be saying that
> > this
> > > > > > current implementation is not good.
> > > > > >
> > > > > > A possible use case is where there is one or a small number of
> > topics
> > > > in
> > > > > > the namespace that store important messages, which need to be
> > > > replicated
> > > > > to
> > > > > > other clusters. Meanwhile, other topics only need to store data
> in
> > > the
> > > > > > local cluster.
> > > > > >
> > > > >
> > > > > Is it not possible to simply have the other topics in a namespace
> > which
> > > > > allows for that other cluster, and the local topics remain in the
> > > > namespace
> > > > > with local cluster needs. Seems to me like a proper use case of two
> > > > > different namespaces as the use case is different in both cases.
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > For example, only topic1 needs replication, while topic2 to
> > topic100
> > > do
> > > > > > not. According to the current implementation, we need to set
> > > > replication
> > > > > > clusters at the namespace level (e.g. cluster1 and cluster2), and
> > > then
> > > > > set
> > > > > > the topic-level replication clusters (cluster1) for topic2 to
> > > topic100
> > > > to
> > > > > > exclude them. It's hard to say that this is a good design.
> > > > > >
> > > > >
> > > > > No, all you need is to put topic1 in namespace1 and topic2 to
> > topic100
> > > in
> > > > > namespace2 . This is exactly what me and Joe were saying is a bad
> > > design
> > > > > choice that you are clubbing all 100 topics in same namespace.
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > Best regards.
> > > > > >
> > > > > > On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Girish,
> > > > > > >
> > > > > > > Thank you for making my point much better than I did ..
> > > > > > >
> > > > > > > -Joe
> > > > > > >
> > > > > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <
> > > > scrapmachines@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hello Xiangying,
> > > > > > > >
> > > > > > > > I believe what Joe here is referring to as "application
> design"
> > > is
> > > > > not
> > > > > > > the
> > > > > > > > design of pulsar or namespace level replication but the
> design
> > of
> > > > > your
> > > > > > > > application and the dependency that you've put on topic level
> > > > > > > replication.
> > > > > > > >
> > > > > > > > In general, I am aligned with Joe from an application design
> > > > > > standpoint.
> > > > > > > A
> > > > > > > > namespace is supposed to represent a single application use
> > case,
> > > > > topic
> > > > > > > > level override of replication clusters helps in cases where
> > there
> > > > > are a
> > > > > > > few
> > > > > > > > exceptional topics which do not need replication in all of
> the
> > > > > > namespace
> > > > > > > > clusters. This helps in saving network bandwidth, storage,
> CPU,
> > > RAM
> > > > > etc
> > > > > > > >
> > > > > > > > But the reason why you've raised this PIP is to bring down
> the
> > > > actual
> > > > > > > > replication semantics at a topic level. Yes, namespace level
> > > still
> > > > > > exists
> > > > > > > > as per your PIP as well, but is basically left only to be a
> > > > "default
> > > > > in
> > > > > > > > case topic level is missing".
> > > > > > > > This brings me to a very basic question - What's the use case
> > > that
> > > > > you
> > > > > > > are
> > > > > > > > trying to solve that needs these changes? Because, then
> what's
> > > > > stopping
> > > > > > > us
> > > > > > > > from bringing every construct that's at a namespace level
> > > > (bundling,
> > > > > > > > hardware affinity, etc) down to a topic level?
> > > > > > > >
> > > > > > > > Regards
> > > > > > > >
> > > > > > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <
> > > > xiangying@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Joe,
> > > > > > > > >
> > > > > > > > > You're correct. The initial design of the replication
> policy
> > > > leaves
> > > > > > > room
> > > > > > > > > for improvement. To address this, we aim to refine the
> > cluster
> > > > > > settings
> > > > > > > > at
> > > > > > > > > the namespace level in a way that won't impact the existing
> > > > system.
> > > > > > The
> > > > > > > > > replication clusters should solely be used to establish
> full
> > > mesh
> > > > > > > > > replication for that specific namespace, without having any
> > > other
> > > > > > > > > definitions or functionalities.
> > > > > > > > >
> > > > > > > > > BR,
> > > > > > > > > Xiangying
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <
> joefrancisk@gmail.com>
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > >if users want to change the replication policy for
> > > > > > > > > > topic-n and do not change the replication policy of other
> > > > topics,
> > > > > > > they
> > > > > > > > > need
> > > > > > > > > > to change all the topic policy under this namespace.
> > > > > > > > > >
> > > > > > > > > > This PIP unfortunately  flows from  attempting to solve
> bad
> > > > > > > application
> > > > > > > > > > design
> > > > > > > > > >
> > > > > > > > > > A namespace is supposed to represent an application, and
> > the
> > > > > > > namespace
> > > > > > > > > > policy is an umbrella for a similar set of policies  that
> > > > applies
> > > > > > to
> > > > > > > > all
> > > > > > > > > > topics.  The exceptions would be if a topic had a need
> for
> > a
> > > > > > deficit,
> > > > > > > > The
> > > > > > > > > > case of one topic in the namespace sticking out of the
> > > > namespace
> > > > > > > policy
> > > > > > > > > > umbrella is bad  application design in my opinion
> > > > > > > > > >
> > > > > > > > > > -Joe.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <
> > > > > > xiangying@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Rajan and Girish,
> > > > > > > > > > > Thanks for your reply. About the question you
> mentioned,
> > > > there
> > > > > is
> > > > > > > > some
> > > > > > > > > > > information I want to share with you.
> > > > > > > > > > > >If anyone wants to setup different replication
> clusters
> > > then
> > > > > > > either
> > > > > > > > > > > >those topics can be created under different namespaces
> > or
> > > > > > defined
> > > > > > > at
> > > > > > > > > > topic
> > > > > > > > > > > >level policy.
> > > > > > > > > > >
> > > > > > > > > > > >And users can anyway go and update the namespace's
> > cluster
> > > > > list
> > > > > > to
> > > > > > > > add
> > > > > > > > > > the
> > > > > > > > > > > >missing cluster.
> > > > > > > > > > > Because the replication clusters also mean the clusters
> > > where
> > > > > the
> > > > > > > > topic
> > > > > > > > > > can
> > > > > > > > > > > be created or loaded, the topic-level replication
> > clusters
> > > > can
> > > > > > only
> > > > > > > > be
> > > > > > > > > > the
> > > > > > > > > > > subset of namespace-level replication clusters.
> > > > > > > > > > > Just as Girish mentioned, the users can update the
> > > > namespace's
> > > > > > > > cluster
> > > > > > > > > > list
> > > > > > > > > > > to add the missing cluster.
> > > > > > > > > > > But there is a problem because the replication clusters
> > as
> > > > the
> > > > > > > > > namespace
> > > > > > > > > > > level will create a full mesh replication for that
> > > namespace
> > > > > > across
> > > > > > > > the
> > > > > > > > > > > clusters defined in
> > > > > > > > > > > replication-clusters if users want to change the
> > > replication
> > > > > > policy
> > > > > > > > for
> > > > > > > > > > > topic-n and do not change the replication policy of
> other
> > > > > topics,
> > > > > > > > they
> > > > > > > > > > need
> > > > > > > > > > > to change all the topic policy under this namespace.
> > > > > > > > > > >
> > > > > > > > > > > > Pulsar is being used by many legacy systems and
> > changing
> > > > its
> > > > > > > > > > > >semantics for specific usecases without considering
> > > > > consequences
> > > > > > > are
> > > > > > > > > > > >creating a lot of pain and incompatibility problems
> for
> > > > other
> > > > > > > > existing
> > > > > > > > > > > >systems and let's avoid doing it as we are struggling
> > with
> > > > > such
> > > > > > > > > changes
> > > > > > > > > > > and
> > > > > > > > > > > >breaking compatibility or changing semantics are just
> > not
> > > > > > > > acceptable.
> > > > > > > > > > >
> > > > > > > > > > > This proposal will not introduce an incompatibility
> > > problem,
> > > > > > > because
> > > > > > > > > the
> > > > > > > > > > > default value of the namespace policy of
> allowed-clusters
> > > and
> > > > > > > > > > > topic-policy-synchronized-clusters are the
> > > > > replication-clusters.
> > > > > > > > > > >
> > > > > > > > > > > >Allowed clusters defined at tenant level
> > > > > > > > > > > >will restrict tenants to create namespaces in
> > > > regions/clusters
> > > > > > > where
> > > > > > > > > > they
> > > > > > > > > > > >are not allowed.
> > > > > > > > > > > >As Rajan also mentioned, allowed-clusters field has a
> > > > > different
> > > > > > > > > > > meaning/purpose.
> > > > > > > > > > >
> > > > > > > > > > > Allowed clusters defined at the tenant level will
> > restrict
> > > > > > tenants
> > > > > > > > from
> > > > > > > > > > > creating namespaces in regions/clusters where they are
> > not
> > > > > > allowed.
> > > > > > > > > > > Similarly, the allowed clusters defined at the
> namespace
> > > > level
> > > > > > will
> > > > > > > > > > > restrict the namespace from creating topics in
> > > > regions/clusters
> > > > > > > where
> > > > > > > > > > they
> > > > > > > > > > > are not allowed.
> > > > > > > > > > > What's wrong with this?
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Xiangying
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > > > > > > > scrapmachines@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Xiangying,
> > > > > > > > > > > >
> > > > > > > > > > > > Shouldn't the solution to the issue mentioned in
> #21564
> > > [0]
> > > > > > > mostly
> > > > > > > > be
> > > > > > > > > > > > around validating that topic level replication
> clusters
> > > are
> > > > > > > subset
> > > > > > > > of
> > > > > > > > > > > > namespace level replication clusters?
> > > > > > > > > > > > It would be a completely compatible change as even
> > today
> > > > the
> > > > > > case
> > > > > > > > > > where a
> > > > > > > > > > > > topic has a cluster not defined in namespaces's
> > > > > > > > replication-clusters
> > > > > > > > > > > > doesn't really work.
> > > > > > > > > > > > And users can anyway go and update the namespace's
> > > cluster
> > > > > list
> > > > > > > to
> > > > > > > > > add
> > > > > > > > > > > the
> > > > > > > > > > > > missing cluster.
> > > > > > > > > > > >
> > > > > > > > > > > > As Rajan also mentioned, allowed-clusters field has a
> > > > > different
> > > > > > > > > > > > meaning/purpose.
> > > > > > > > > > > > Regards
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > > > > > > > xiangying@apache.org>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Pulsar Community
> > > > > > > > > > > > >
> > > > > > > > > > > > > I drafted a proposal to make the configuration of
> > > > clusters
> > > > > at
> > > > > > > the
> > > > > > > > > > > > namespace
> > > > > > > > > > > > > level clearer. This helps solve the problem of
> > > > > > geo-replication
> > > > > > > > not
> > > > > > > > > > > > working
> > > > > > > > > > > > > correctly at the topic level.
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm looking forward to hearing from you.
> > > > > > > > > > > > >
> > > > > > > > > > > > > BR
> > > > > > > > > > > > > Xiangying
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Girish Sharma
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Girish Sharma
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Girish Sharma
> > > > >
> > > >
> > >
> > >
> > > --
> > > Girish Sharma
> > >
> >
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Jiwei

Great advice. Thanks for your suggestions and additions.

Thanks,
Xiangying

On Fri, Dec 15, 2023 at 9:41 AM guo jiwei <te...@apache.org> wrote:

> Hi Xiangying,
>    I think  we can rename this PIP to:   *Introduce `allowed-clusters` and
> `topic-policy-synchronized-clusters` to fully support replication on
> message and topic level*
>    Currently, we can set replication clusters on the message and topic
> level, but the replication clusters should be a subset of the namespace
> replication clusters. which means :
>    If we set namespace replication clusters: cluster1, cluster2, cluster3,
> at most, these three or two clusters can be set on message or topic. If the
> user wanna set cluster4 or others, the replication
>    can't work as expected.
>    It's easy to reproduce by this test:
>
> >    @Test
>
>     public void testEnableReplicationInTopicLevel() throws Exception {
>
>         // 1. Create namespace and topic
>
>         String namespace =
> > BrokerTestUtil.newUniqueName("pulsar/testEnableReplicationInTopicLevel");
>
>         String topic1 = "persistent://" + namespace + "/topic-1";
>
>         admin1.namespaces().createNamespace(namespace);
>
>         admin1.topics().createNonPartitionedTopic(topic1);
>
>         // 2. Configure replication clusters for the topic.
>
>         admin1.topics().setReplicationClusters(topic1, List.of("r1",
> "r2"));
>
>         // 3. Check if the replicator connected successfully.
>
>         Awaitility.await().atMost(5, TimeUnit.MINUTES).untilAsserted(() ->
> {
>
>             List<String> keys = pulsar1.getBrokerService()
>
>                     .getTopic(topic1, false).get().get()
>
>                     .getReplicators().keys();
>
>             assertEquals(keys.size(), 1);
>
>             assertTrue(pulsar1.getBrokerService()
>
>                     .getTopic(topic1, false).get().get()
>
>                     .getReplicators().get(keys.get(0)).isConnected());
>
>         });
>
>     }
>
>
>   To fully support the replication, we find out an easy way to solve it.
> Introduce `allowed-clusters` on namespace policies, which Xiangying
> explains above.
>   How could this work and solve the issue? The same example :
>   If we set namespace replication clusters: cluster1, cluster2, cluster3,
> and
>            set topic1 replication clusters: cluster2, cluster4.
>            set topic2 replication clusters: cluster1, cluster4.
>   We must set `allowed-clusters` with cluster1, cluster2, cluster3, and
> cluster4.  The broker side will validate the topic or message replication
> clusters from the `allowed-cluster.`
>   In this way,  we can simplify more codes and logic here.
>   For *`topic-policy-synchronized-clusters` *we also add examples in the
> PIP.
>
>   Hope the explanation could help @Rajan @Girish
>
>
>
>
> Regards
> Jiwei Guo (Tboy)
>
>
> On Thu, Dec 7, 2023 at 10:29 PM Xiangying Meng <xi...@apache.org>
> wrote:
>
> > Hi Girish,
> >
> > I'm very pleased that we have reached some consensus now. Pulsar already
> > supports geo-replication at the topic level, but the existing
> > implementation of this topic level replication does not match our
> > expectations.
> >
> > At the moment, I can think of three directions to solve this problem:
> >
> > 1. Treat this issue as a bug and fix it so that Pulsar can truly support
> > replication at the topic level.
> > 2. Limit the replication topic policy, so that the replication clusters
> at
> > the topic level must be included in the replication clusters configured
> at
> > the namespace level. In this case, the topic level replication would
> serve
> > as a supplement to the namespace replication, rather than a true topic
> > level policy.
> > 3. Remove topic level replication.
> >
> > I lean towards the first option, as it would make Pulsar's replication
> > configuration more flexible and would not break the previous behavior
> > logic.
> >
> > >Yes, that's my viewpoint. In case that's not your view point, then in
> your
> > >use cases do you ever have more than one namespace inside a tenant?
> > >With every property coming at topic level, it makes no sense for the
> > >namespace hierarchy to exist anymore.
> >
> > I didn't propose this from the perspective of a user, but from the
> > perspective of a Pulsar maintainer. The replication cluster at the topic
> > level cannot function independently like other topic policies, and I
> > attempted to fix it after finding the reason.
> >
> > From the user's perspective, I could modify my system to put topics with
> > the same replication strategy under the same namespace. From the
> > maintainer's perspective, if a feature can help users use Pulsar more
> > flexibly and conveniently without introducing risks, then this feature
> > should be implemented. Perhaps business systems do not want to maintain
> too
> > many namespaces, as they would need to configure multiple namespace
> > policies or it might make their business logic complex. The other
> > configurations for topics under this namespace might be consistent, with
> > only a few topics needing to enable replication. In this case, topic
> level
> > replication becomes valuable. Therefore, I lean towards the first option,
> > to solve this problem and make it a truly expected topic policy.
> >
> > On Thu, Dec 7, 2023 at 12:45 PM Girish Sharma <sc...@gmail.com>
> > wrote:
> >
> > > Hello Xiangying,
> > >
> > >
> > > On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng <xi...@apache.org>
> > > wrote:
> > >
> > > > Hi Girish,
> > > >
> > > > What you are actually opposing is the implementation of true
> > topic-level
> > > > geo-replication. You believe that topics should be divided into
> > different
> > > > namespaces based on replication. Following this line of thought, what
> > we
> > > > should do is restrict the current topic-level replication settings,
> not
> > > > allowing the replication clusters set at the topic level to exceed
> the
> > > > range of replication clusters set in the namespace.
> > > >
> > >
> > > Yes, that's my viewpoint. In case that's not your view point, then in
> > your
> > > use cases do you ever have more than one namespace inside a tenant?
> > > With every property coming at topic level, it makes no sense for the
> > > namespace hierarchy to exist anymore.
> > >
> > >
> > > >
> > > > One point that confuses me is that we provide a setting for
> topic-level
> > > > replication clusters, but it can only be used to amend the namespace
> > > > settings and cannot work independently. Isn't this also a poor design
> > for
> > > > Pulsar?
> > > >
> > >
> > > This feature was originally added in pulsar without a PIP. And the PR
> [0]
> > > also doesn't have much context around why it was needed and why it is
> > being
> > > added.. So I can't comment on why this was added..
> > > But my understanding is that even in a situation when the topics are
> > > divided into proper namespaces based on use cases and suddenly there is
> > an
> > > exceptional need for one of the existing topics to have lesser
> > replication,
> > > then instead of following a long exercise of moving that topic to a new
> > > namespace, you can use this feature.
> > >
> > > [0] - https://github.com/apache/pulsar/pull/12136
> > >
> > >
> > >
> > > >
> > > > On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma <
> scrapmachines@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello, replies inline.
> > > > >
> > > > > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <
> xiangying@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi Girish,
> > > > > >
> > > > > > Thank you for your explanation. Because Joe's email referenced
> the
> > > > > current
> > > > > > implementation of Pulsar, I misunderstood him to be saying that
> > this
> > > > > > current implementation is not good.
> > > > > >
> > > > > > A possible use case is where there is one or a small number of
> > topics
> > > > in
> > > > > > the namespace that store important messages, which need to be
> > > > replicated
> > > > > to
> > > > > > other clusters. Meanwhile, other topics only need to store data
> in
> > > the
> > > > > > local cluster.
> > > > > >
> > > > >
> > > > > Is it not possible to simply have the other topics in a namespace
> > which
> > > > > allows for that other cluster, and the local topics remain in the
> > > > namespace
> > > > > with local cluster needs. Seems to me like a proper use case of two
> > > > > different namespaces as the use case is different in both cases.
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > For example, only topic1 needs replication, while topic2 to
> > topic100
> > > do
> > > > > > not. According to the current implementation, we need to set
> > > > replication
> > > > > > clusters at the namespace level (e.g. cluster1 and cluster2), and
> > > then
> > > > > set
> > > > > > the topic-level replication clusters (cluster1) for topic2 to
> > > topic100
> > > > to
> > > > > > exclude them. It's hard to say that this is a good design.
> > > > > >
> > > > >
> > > > > No, all you need is to put topic1 in namespace1 and topic2 to
> > topic100
> > > in
> > > > > namespace2 . This is exactly what me and Joe were saying is a bad
> > > design
> > > > > choice that you are clubbing all 100 topics in same namespace.
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > Best regards.
> > > > > >
> > > > > > On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Girish,
> > > > > > >
> > > > > > > Thank you for making my point much better than I did ..
> > > > > > >
> > > > > > > -Joe
> > > > > > >
> > > > > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <
> > > > scrapmachines@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hello Xiangying,
> > > > > > > >
> > > > > > > > I believe what Joe here is referring to as "application
> design"
> > > is
> > > > > not
> > > > > > > the
> > > > > > > > design of pulsar or namespace level replication but the
> design
> > of
> > > > > your
> > > > > > > > application and the dependency that you've put on topic level
> > > > > > > replication.
> > > > > > > >
> > > > > > > > In general, I am aligned with Joe from an application design
> > > > > > standpoint.
> > > > > > > A
> > > > > > > > namespace is supposed to represent a single application use
> > case,
> > > > > topic
> > > > > > > > level override of replication clusters helps in cases where
> > there
> > > > > are a
> > > > > > > few
> > > > > > > > exceptional topics which do not need replication in all of
> the
> > > > > > namespace
> > > > > > > > clusters. This helps in saving network bandwidth, storage,
> CPU,
> > > RAM
> > > > > etc
> > > > > > > >
> > > > > > > > But the reason why you've raised this PIP is to bring down
> the
> > > > actual
> > > > > > > > replication semantics at a topic level. Yes, namespace level
> > > still
> > > > > > exists
> > > > > > > > as per your PIP as well, but is basically left only to be a
> > > > "default
> > > > > in
> > > > > > > > case topic level is missing".
> > > > > > > > This brings me to a very basic question - What's the use case
> > > that
> > > > > you
> > > > > > > are
> > > > > > > > trying to solve that needs these changes? Because, then
> what's
> > > > > stopping
> > > > > > > us
> > > > > > > > from bringing every construct that's at a namespace level
> > > > (bundling,
> > > > > > > > hardware affinity, etc) down to a topic level?
> > > > > > > >
> > > > > > > > Regards
> > > > > > > >
> > > > > > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <
> > > > xiangying@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Joe,
> > > > > > > > >
> > > > > > > > > You're correct. The initial design of the replication
> policy
> > > > leaves
> > > > > > > room
> > > > > > > > > for improvement. To address this, we aim to refine the
> > cluster
> > > > > > settings
> > > > > > > > at
> > > > > > > > > the namespace level in a way that won't impact the existing
> > > > system.
> > > > > > The
> > > > > > > > > replication clusters should solely be used to establish
> full
> > > mesh
> > > > > > > > > replication for that specific namespace, without having any
> > > other
> > > > > > > > > definitions or functionalities.
> > > > > > > > >
> > > > > > > > > BR,
> > > > > > > > > Xiangying
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <
> joefrancisk@gmail.com>
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > >if users want to change the replication policy for
> > > > > > > > > > topic-n and do not change the replication policy of other
> > > > topics,
> > > > > > > they
> > > > > > > > > need
> > > > > > > > > > to change all the topic policy under this namespace.
> > > > > > > > > >
> > > > > > > > > > This PIP unfortunately  flows from  attempting to solve
> bad
> > > > > > > application
> > > > > > > > > > design
> > > > > > > > > >
> > > > > > > > > > A namespace is supposed to represent an application, and
> > the
> > > > > > > namespace
> > > > > > > > > > policy is an umbrella for a similar set of policies  that
> > > > applies
> > > > > > to
> > > > > > > > all
> > > > > > > > > > topics.  The exceptions would be if a topic had a need
> for
> > a
> > > > > > deficit,
> > > > > > > > The
> > > > > > > > > > case of one topic in the namespace sticking out of the
> > > > namespace
> > > > > > > policy
> > > > > > > > > > umbrella is bad  application design in my opinion
> > > > > > > > > >
> > > > > > > > > > -Joe.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <
> > > > > > xiangying@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Rajan and Girish,
> > > > > > > > > > > Thanks for your reply. About the question you
> mentioned,
> > > > there
> > > > > is
> > > > > > > > some
> > > > > > > > > > > information I want to share with you.
> > > > > > > > > > > >If anyone wants to setup different replication
> clusters
> > > then
> > > > > > > either
> > > > > > > > > > > >those topics can be created under different namespaces
> > or
> > > > > > defined
> > > > > > > at
> > > > > > > > > > topic
> > > > > > > > > > > >level policy.
> > > > > > > > > > >
> > > > > > > > > > > >And users can anyway go and update the namespace's
> > cluster
> > > > > list
> > > > > > to
> > > > > > > > add
> > > > > > > > > > the
> > > > > > > > > > > >missing cluster.
> > > > > > > > > > > Because the replication clusters also mean the clusters
> > > where
> > > > > the
> > > > > > > > topic
> > > > > > > > > > can
> > > > > > > > > > > be created or loaded, the topic-level replication
> > clusters
> > > > can
> > > > > > only
> > > > > > > > be
> > > > > > > > > > the
> > > > > > > > > > > subset of namespace-level replication clusters.
> > > > > > > > > > > Just as Girish mentioned, the users can update the
> > > > namespace's
> > > > > > > > cluster
> > > > > > > > > > list
> > > > > > > > > > > to add the missing cluster.
> > > > > > > > > > > But there is a problem because the replication clusters
> > as
> > > > the
> > > > > > > > > namespace
> > > > > > > > > > > level will create a full mesh replication for that
> > > namespace
> > > > > > across
> > > > > > > > the
> > > > > > > > > > > clusters defined in
> > > > > > > > > > > replication-clusters if users want to change the
> > > replication
> > > > > > policy
> > > > > > > > for
> > > > > > > > > > > topic-n and do not change the replication policy of
> other
> > > > > topics,
> > > > > > > > they
> > > > > > > > > > need
> > > > > > > > > > > to change all the topic policy under this namespace.
> > > > > > > > > > >
> > > > > > > > > > > > Pulsar is being used by many legacy systems and
> > changing
> > > > its
> > > > > > > > > > > >semantics for specific usecases without considering
> > > > > consequences
> > > > > > > are
> > > > > > > > > > > >creating a lot of pain and incompatibility problems
> for
> > > > other
> > > > > > > > existing
> > > > > > > > > > > >systems and let's avoid doing it as we are struggling
> > with
> > > > > such
> > > > > > > > > changes
> > > > > > > > > > > and
> > > > > > > > > > > >breaking compatibility or changing semantics are just
> > not
> > > > > > > > acceptable.
> > > > > > > > > > >
> > > > > > > > > > > This proposal will not introduce an incompatibility
> > > problem,
> > > > > > > because
> > > > > > > > > the
> > > > > > > > > > > default value of the namespace policy of
> allowed-clusters
> > > and
> > > > > > > > > > > topic-policy-synchronized-clusters are the
> > > > > replication-clusters.
> > > > > > > > > > >
> > > > > > > > > > > >Allowed clusters defined at tenant level
> > > > > > > > > > > >will restrict tenants to create namespaces in
> > > > regions/clusters
> > > > > > > where
> > > > > > > > > > they
> > > > > > > > > > > >are not allowed.
> > > > > > > > > > > >As Rajan also mentioned, allowed-clusters field has a
> > > > > different
> > > > > > > > > > > meaning/purpose.
> > > > > > > > > > >
> > > > > > > > > > > Allowed clusters defined at the tenant level will
> > restrict
> > > > > > tenants
> > > > > > > > from
> > > > > > > > > > > creating namespaces in regions/clusters where they are
> > not
> > > > > > allowed.
> > > > > > > > > > > Similarly, the allowed clusters defined at the
> namespace
> > > > level
> > > > > > will
> > > > > > > > > > > restrict the namespace from creating topics in
> > > > regions/clusters
> > > > > > > where
> > > > > > > > > > they
> > > > > > > > > > > are not allowed.
> > > > > > > > > > > What's wrong with this?
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Xiangying
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > > > > > > > scrapmachines@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Xiangying,
> > > > > > > > > > > >
> > > > > > > > > > > > Shouldn't the solution to the issue mentioned in
> #21564
> > > [0]
> > > > > > > mostly
> > > > > > > > be
> > > > > > > > > > > > around validating that topic level replication
> clusters
> > > are
> > > > > > > subset
> > > > > > > > of
> > > > > > > > > > > > namespace level replication clusters?
> > > > > > > > > > > > It would be a completely compatible change as even
> > today
> > > > the
> > > > > > case
> > > > > > > > > > where a
> > > > > > > > > > > > topic has a cluster not defined in namespaces's
> > > > > > > > replication-clusters
> > > > > > > > > > > > doesn't really work.
> > > > > > > > > > > > And users can anyway go and update the namespace's
> > > cluster
> > > > > list
> > > > > > > to
> > > > > > > > > add
> > > > > > > > > > > the
> > > > > > > > > > > > missing cluster.
> > > > > > > > > > > >
> > > > > > > > > > > > As Rajan also mentioned, allowed-clusters field has a
> > > > > different
> > > > > > > > > > > > meaning/purpose.
> > > > > > > > > > > > Regards
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > > > > > > > xiangying@apache.org>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Pulsar Community
> > > > > > > > > > > > >
> > > > > > > > > > > > > I drafted a proposal to make the configuration of
> > > > clusters
> > > > > at
> > > > > > > the
> > > > > > > > > > > > namespace
> > > > > > > > > > > > > level clearer. This helps solve the problem of
> > > > > > geo-replication
> > > > > > > > not
> > > > > > > > > > > > working
> > > > > > > > > > > > > correctly at the topic level.
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm looking forward to hearing from you.
> > > > > > > > > > > > >
> > > > > > > > > > > > > BR
> > > > > > > > > > > > > Xiangying
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Girish Sharma
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Girish Sharma
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Girish Sharma
> > > > >
> > > >
> > >
> > >
> > > --
> > > Girish Sharma
> > >
> >
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by guo jiwei <te...@apache.org>.
Hi Xiangying,
   I think  we can rename this PIP to:   *Introduce `allowed-clusters` and
`topic-policy-synchronized-clusters` to fully support replication on
message and topic level*
   Currently, we can set replication clusters on the message and topic
level, but the replication clusters should be a subset of the namespace
replication clusters. which means :
   If we set namespace replication clusters: cluster1, cluster2, cluster3,
at most, these three or two clusters can be set on message or topic. If the
user wanna set cluster4 or others, the replication
   can't work as expected.
   It's easy to reproduce by this test:

>    @Test

    public void testEnableReplicationInTopicLevel() throws Exception {

        // 1. Create namespace and topic

        String namespace =
> BrokerTestUtil.newUniqueName("pulsar/testEnableReplicationInTopicLevel");

        String topic1 = "persistent://" + namespace + "/topic-1";

        admin1.namespaces().createNamespace(namespace);

        admin1.topics().createNonPartitionedTopic(topic1);

        // 2. Configure replication clusters for the topic.

        admin1.topics().setReplicationClusters(topic1, List.of("r1", "r2"));

        // 3. Check if the replicator connected successfully.

        Awaitility.await().atMost(5, TimeUnit.MINUTES).untilAsserted(() -> {

            List<String> keys = pulsar1.getBrokerService()

                    .getTopic(topic1, false).get().get()

                    .getReplicators().keys();

            assertEquals(keys.size(), 1);

            assertTrue(pulsar1.getBrokerService()

                    .getTopic(topic1, false).get().get()

                    .getReplicators().get(keys.get(0)).isConnected());

        });

    }


  To fully support the replication, we find out an easy way to solve it.
Introduce `allowed-clusters` on namespace policies, which Xiangying
explains above.
  How could this work and solve the issue? The same example :
  If we set namespace replication clusters: cluster1, cluster2, cluster3,
and
           set topic1 replication clusters: cluster2, cluster4.
           set topic2 replication clusters: cluster1, cluster4.
  We must set `allowed-clusters` with cluster1, cluster2, cluster3, and
cluster4.  The broker side will validate the topic or message replication
clusters from the `allowed-cluster.`
  In this way,  we can simplify more codes and logic here.
  For *`topic-policy-synchronized-clusters` *we also add examples in the
PIP.

  Hope the explanation could help @Rajan @Girish




Regards
Jiwei Guo (Tboy)


On Thu, Dec 7, 2023 at 10:29 PM Xiangying Meng <xi...@apache.org> wrote:

> Hi Girish,
>
> I'm very pleased that we have reached some consensus now. Pulsar already
> supports geo-replication at the topic level, but the existing
> implementation of this topic level replication does not match our
> expectations.
>
> At the moment, I can think of three directions to solve this problem:
>
> 1. Treat this issue as a bug and fix it so that Pulsar can truly support
> replication at the topic level.
> 2. Limit the replication topic policy, so that the replication clusters at
> the topic level must be included in the replication clusters configured at
> the namespace level. In this case, the topic level replication would serve
> as a supplement to the namespace replication, rather than a true topic
> level policy.
> 3. Remove topic level replication.
>
> I lean towards the first option, as it would make Pulsar's replication
> configuration more flexible and would not break the previous behavior
> logic.
>
> >Yes, that's my viewpoint. In case that's not your view point, then in your
> >use cases do you ever have more than one namespace inside a tenant?
> >With every property coming at topic level, it makes no sense for the
> >namespace hierarchy to exist anymore.
>
> I didn't propose this from the perspective of a user, but from the
> perspective of a Pulsar maintainer. The replication cluster at the topic
> level cannot function independently like other topic policies, and I
> attempted to fix it after finding the reason.
>
> From the user's perspective, I could modify my system to put topics with
> the same replication strategy under the same namespace. From the
> maintainer's perspective, if a feature can help users use Pulsar more
> flexibly and conveniently without introducing risks, then this feature
> should be implemented. Perhaps business systems do not want to maintain too
> many namespaces, as they would need to configure multiple namespace
> policies or it might make their business logic complex. The other
> configurations for topics under this namespace might be consistent, with
> only a few topics needing to enable replication. In this case, topic level
> replication becomes valuable. Therefore, I lean towards the first option,
> to solve this problem and make it a truly expected topic policy.
>
> On Thu, Dec 7, 2023 at 12:45 PM Girish Sharma <sc...@gmail.com>
> wrote:
>
> > Hello Xiangying,
> >
> >
> > On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng <xi...@apache.org>
> > wrote:
> >
> > > Hi Girish,
> > >
> > > What you are actually opposing is the implementation of true
> topic-level
> > > geo-replication. You believe that topics should be divided into
> different
> > > namespaces based on replication. Following this line of thought, what
> we
> > > should do is restrict the current topic-level replication settings, not
> > > allowing the replication clusters set at the topic level to exceed the
> > > range of replication clusters set in the namespace.
> > >
> >
> > Yes, that's my viewpoint. In case that's not your view point, then in
> your
> > use cases do you ever have more than one namespace inside a tenant?
> > With every property coming at topic level, it makes no sense for the
> > namespace hierarchy to exist anymore.
> >
> >
> > >
> > > One point that confuses me is that we provide a setting for topic-level
> > > replication clusters, but it can only be used to amend the namespace
> > > settings and cannot work independently. Isn't this also a poor design
> for
> > > Pulsar?
> > >
> >
> > This feature was originally added in pulsar without a PIP. And the PR [0]
> > also doesn't have much context around why it was needed and why it is
> being
> > added.. So I can't comment on why this was added..
> > But my understanding is that even in a situation when the topics are
> > divided into proper namespaces based on use cases and suddenly there is
> an
> > exceptional need for one of the existing topics to have lesser
> replication,
> > then instead of following a long exercise of moving that topic to a new
> > namespace, you can use this feature.
> >
> > [0] - https://github.com/apache/pulsar/pull/12136
> >
> >
> >
> > >
> > > On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma <sc...@gmail.com>
> > > wrote:
> > >
> > > > Hello, replies inline.
> > > >
> > > > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <xi...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi Girish,
> > > > >
> > > > > Thank you for your explanation. Because Joe's email referenced the
> > > > current
> > > > > implementation of Pulsar, I misunderstood him to be saying that
> this
> > > > > current implementation is not good.
> > > > >
> > > > > A possible use case is where there is one or a small number of
> topics
> > > in
> > > > > the namespace that store important messages, which need to be
> > > replicated
> > > > to
> > > > > other clusters. Meanwhile, other topics only need to store data in
> > the
> > > > > local cluster.
> > > > >
> > > >
> > > > Is it not possible to simply have the other topics in a namespace
> which
> > > > allows for that other cluster, and the local topics remain in the
> > > namespace
> > > > with local cluster needs. Seems to me like a proper use case of two
> > > > different namespaces as the use case is different in both cases.
> > > >
> > > >
> > > >
> > > > >
> > > > > For example, only topic1 needs replication, while topic2 to
> topic100
> > do
> > > > > not. According to the current implementation, we need to set
> > > replication
> > > > > clusters at the namespace level (e.g. cluster1 and cluster2), and
> > then
> > > > set
> > > > > the topic-level replication clusters (cluster1) for topic2 to
> > topic100
> > > to
> > > > > exclude them. It's hard to say that this is a good design.
> > > > >
> > > >
> > > > No, all you need is to put topic1 in namespace1 and topic2 to
> topic100
> > in
> > > > namespace2 . This is exactly what me and Joe were saying is a bad
> > design
> > > > choice that you are clubbing all 100 topics in same namespace.
> > > >
> > > >
> > > >
> > > > >
> > > > > Best regards.
> > > > >
> > > > > On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com>
> wrote:
> > > > >
> > > > > > Girish,
> > > > > >
> > > > > > Thank you for making my point much better than I did ..
> > > > > >
> > > > > > -Joe
> > > > > >
> > > > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <
> > > scrapmachines@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hello Xiangying,
> > > > > > >
> > > > > > > I believe what Joe here is referring to as "application design"
> > is
> > > > not
> > > > > > the
> > > > > > > design of pulsar or namespace level replication but the design
> of
> > > > your
> > > > > > > application and the dependency that you've put on topic level
> > > > > > replication.
> > > > > > >
> > > > > > > In general, I am aligned with Joe from an application design
> > > > > standpoint.
> > > > > > A
> > > > > > > namespace is supposed to represent a single application use
> case,
> > > > topic
> > > > > > > level override of replication clusters helps in cases where
> there
> > > > are a
> > > > > > few
> > > > > > > exceptional topics which do not need replication in all of the
> > > > > namespace
> > > > > > > clusters. This helps in saving network bandwidth, storage, CPU,
> > RAM
> > > > etc
> > > > > > >
> > > > > > > But the reason why you've raised this PIP is to bring down the
> > > actual
> > > > > > > replication semantics at a topic level. Yes, namespace level
> > still
> > > > > exists
> > > > > > > as per your PIP as well, but is basically left only to be a
> > > "default
> > > > in
> > > > > > > case topic level is missing".
> > > > > > > This brings me to a very basic question - What's the use case
> > that
> > > > you
> > > > > > are
> > > > > > > trying to solve that needs these changes? Because, then what's
> > > > stopping
> > > > > > us
> > > > > > > from bringing every construct that's at a namespace level
> > > (bundling,
> > > > > > > hardware affinity, etc) down to a topic level?
> > > > > > >
> > > > > > > Regards
> > > > > > >
> > > > > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <
> > > xiangying@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Joe,
> > > > > > > >
> > > > > > > > You're correct. The initial design of the replication policy
> > > leaves
> > > > > > room
> > > > > > > > for improvement. To address this, we aim to refine the
> cluster
> > > > > settings
> > > > > > > at
> > > > > > > > the namespace level in a way that won't impact the existing
> > > system.
> > > > > The
> > > > > > > > replication clusters should solely be used to establish full
> > mesh
> > > > > > > > replication for that specific namespace, without having any
> > other
> > > > > > > > definitions or functionalities.
> > > > > > > >
> > > > > > > > BR,
> > > > > > > > Xiangying
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com>
> > > > wrote:
> > > > > > > >
> > > > > > > > > >if users want to change the replication policy for
> > > > > > > > > topic-n and do not change the replication policy of other
> > > topics,
> > > > > > they
> > > > > > > > need
> > > > > > > > > to change all the topic policy under this namespace.
> > > > > > > > >
> > > > > > > > > This PIP unfortunately  flows from  attempting to solve bad
> > > > > > application
> > > > > > > > > design
> > > > > > > > >
> > > > > > > > > A namespace is supposed to represent an application, and
> the
> > > > > > namespace
> > > > > > > > > policy is an umbrella for a similar set of policies  that
> > > applies
> > > > > to
> > > > > > > all
> > > > > > > > > topics.  The exceptions would be if a topic had a need for
> a
> > > > > deficit,
> > > > > > > The
> > > > > > > > > case of one topic in the namespace sticking out of the
> > > namespace
> > > > > > policy
> > > > > > > > > umbrella is bad  application design in my opinion
> > > > > > > > >
> > > > > > > > > -Joe.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <
> > > > > xiangying@apache.org>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Rajan and Girish,
> > > > > > > > > > Thanks for your reply. About the question you mentioned,
> > > there
> > > > is
> > > > > > > some
> > > > > > > > > > information I want to share with you.
> > > > > > > > > > >If anyone wants to setup different replication clusters
> > then
> > > > > > either
> > > > > > > > > > >those topics can be created under different namespaces
> or
> > > > > defined
> > > > > > at
> > > > > > > > > topic
> > > > > > > > > > >level policy.
> > > > > > > > > >
> > > > > > > > > > >And users can anyway go and update the namespace's
> cluster
> > > > list
> > > > > to
> > > > > > > add
> > > > > > > > > the
> > > > > > > > > > >missing cluster.
> > > > > > > > > > Because the replication clusters also mean the clusters
> > where
> > > > the
> > > > > > > topic
> > > > > > > > > can
> > > > > > > > > > be created or loaded, the topic-level replication
> clusters
> > > can
> > > > > only
> > > > > > > be
> > > > > > > > > the
> > > > > > > > > > subset of namespace-level replication clusters.
> > > > > > > > > > Just as Girish mentioned, the users can update the
> > > namespace's
> > > > > > > cluster
> > > > > > > > > list
> > > > > > > > > > to add the missing cluster.
> > > > > > > > > > But there is a problem because the replication clusters
> as
> > > the
> > > > > > > > namespace
> > > > > > > > > > level will create a full mesh replication for that
> > namespace
> > > > > across
> > > > > > > the
> > > > > > > > > > clusters defined in
> > > > > > > > > > replication-clusters if users want to change the
> > replication
> > > > > policy
> > > > > > > for
> > > > > > > > > > topic-n and do not change the replication policy of other
> > > > topics,
> > > > > > > they
> > > > > > > > > need
> > > > > > > > > > to change all the topic policy under this namespace.
> > > > > > > > > >
> > > > > > > > > > > Pulsar is being used by many legacy systems and
> changing
> > > its
> > > > > > > > > > >semantics for specific usecases without considering
> > > > consequences
> > > > > > are
> > > > > > > > > > >creating a lot of pain and incompatibility problems for
> > > other
> > > > > > > existing
> > > > > > > > > > >systems and let's avoid doing it as we are struggling
> with
> > > > such
> > > > > > > > changes
> > > > > > > > > > and
> > > > > > > > > > >breaking compatibility or changing semantics are just
> not
> > > > > > > acceptable.
> > > > > > > > > >
> > > > > > > > > > This proposal will not introduce an incompatibility
> > problem,
> > > > > > because
> > > > > > > > the
> > > > > > > > > > default value of the namespace policy of allowed-clusters
> > and
> > > > > > > > > > topic-policy-synchronized-clusters are the
> > > > replication-clusters.
> > > > > > > > > >
> > > > > > > > > > >Allowed clusters defined at tenant level
> > > > > > > > > > >will restrict tenants to create namespaces in
> > > regions/clusters
> > > > > > where
> > > > > > > > > they
> > > > > > > > > > >are not allowed.
> > > > > > > > > > >As Rajan also mentioned, allowed-clusters field has a
> > > > different
> > > > > > > > > > meaning/purpose.
> > > > > > > > > >
> > > > > > > > > > Allowed clusters defined at the tenant level will
> restrict
> > > > > tenants
> > > > > > > from
> > > > > > > > > > creating namespaces in regions/clusters where they are
> not
> > > > > allowed.
> > > > > > > > > > Similarly, the allowed clusters defined at the namespace
> > > level
> > > > > will
> > > > > > > > > > restrict the namespace from creating topics in
> > > regions/clusters
> > > > > > where
> > > > > > > > > they
> > > > > > > > > > are not allowed.
> > > > > > > > > > What's wrong with this?
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Xiangying
> > > > > > > > > >
> > > > > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > > > > > > scrapmachines@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Xiangying,
> > > > > > > > > > >
> > > > > > > > > > > Shouldn't the solution to the issue mentioned in #21564
> > [0]
> > > > > > mostly
> > > > > > > be
> > > > > > > > > > > around validating that topic level replication clusters
> > are
> > > > > > subset
> > > > > > > of
> > > > > > > > > > > namespace level replication clusters?
> > > > > > > > > > > It would be a completely compatible change as even
> today
> > > the
> > > > > case
> > > > > > > > > where a
> > > > > > > > > > > topic has a cluster not defined in namespaces's
> > > > > > > replication-clusters
> > > > > > > > > > > doesn't really work.
> > > > > > > > > > > And users can anyway go and update the namespace's
> > cluster
> > > > list
> > > > > > to
> > > > > > > > add
> > > > > > > > > > the
> > > > > > > > > > > missing cluster.
> > > > > > > > > > >
> > > > > > > > > > > As Rajan also mentioned, allowed-clusters field has a
> > > > different
> > > > > > > > > > > meaning/purpose.
> > > > > > > > > > > Regards
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > > > > > > xiangying@apache.org>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Pulsar Community
> > > > > > > > > > > >
> > > > > > > > > > > > I drafted a proposal to make the configuration of
> > > clusters
> > > > at
> > > > > > the
> > > > > > > > > > > namespace
> > > > > > > > > > > > level clearer. This helps solve the problem of
> > > > > geo-replication
> > > > > > > not
> > > > > > > > > > > working
> > > > > > > > > > > > correctly at the topic level.
> > > > > > > > > > > >
> > > > > > > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > > > > > > >
> > > > > > > > > > > > I'm looking forward to hearing from you.
> > > > > > > > > > > >
> > > > > > > > > > > > BR
> > > > > > > > > > > > Xiangying
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Girish Sharma
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Girish Sharma
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Girish Sharma
> > > >
> > >
> >
> >
> > --
> > Girish Sharma
> >
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Girish,

I'm very pleased that we have reached some consensus now. Pulsar already
supports geo-replication at the topic level, but the existing
implementation of this topic level replication does not match our
expectations.

At the moment, I can think of three directions to solve this problem:

1. Treat this issue as a bug and fix it so that Pulsar can truly support
replication at the topic level.
2. Limit the replication topic policy, so that the replication clusters at
the topic level must be included in the replication clusters configured at
the namespace level. In this case, the topic level replication would serve
as a supplement to the namespace replication, rather than a true topic
level policy.
3. Remove topic level replication.

I lean towards the first option, as it would make Pulsar's replication
configuration more flexible and would not break the previous behavior logic.

>Yes, that's my viewpoint. In case that's not your view point, then in your
>use cases do you ever have more than one namespace inside a tenant?
>With every property coming at topic level, it makes no sense for the
>namespace hierarchy to exist anymore.

I didn't propose this from the perspective of a user, but from the
perspective of a Pulsar maintainer. The replication cluster at the topic
level cannot function independently like other topic policies, and I
attempted to fix it after finding the reason.

From the user's perspective, I could modify my system to put topics with
the same replication strategy under the same namespace. From the
maintainer's perspective, if a feature can help users use Pulsar more
flexibly and conveniently without introducing risks, then this feature
should be implemented. Perhaps business systems do not want to maintain too
many namespaces, as they would need to configure multiple namespace
policies or it might make their business logic complex. The other
configurations for topics under this namespace might be consistent, with
only a few topics needing to enable replication. In this case, topic level
replication becomes valuable. Therefore, I lean towards the first option,
to solve this problem and make it a truly expected topic policy.

On Thu, Dec 7, 2023 at 12:45 PM Girish Sharma <sc...@gmail.com>
wrote:

> Hello Xiangying,
>
>
> On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng <xi...@apache.org>
> wrote:
>
> > Hi Girish,
> >
> > What you are actually opposing is the implementation of true topic-level
> > geo-replication. You believe that topics should be divided into different
> > namespaces based on replication. Following this line of thought, what we
> > should do is restrict the current topic-level replication settings, not
> > allowing the replication clusters set at the topic level to exceed the
> > range of replication clusters set in the namespace.
> >
>
> Yes, that's my viewpoint. In case that's not your view point, then in your
> use cases do you ever have more than one namespace inside a tenant?
> With every property coming at topic level, it makes no sense for the
> namespace hierarchy to exist anymore.
>
>
> >
> > One point that confuses me is that we provide a setting for topic-level
> > replication clusters, but it can only be used to amend the namespace
> > settings and cannot work independently. Isn't this also a poor design for
> > Pulsar?
> >
>
> This feature was originally added in pulsar without a PIP. And the PR [0]
> also doesn't have much context around why it was needed and why it is being
> added.. So I can't comment on why this was added..
> But my understanding is that even in a situation when the topics are
> divided into proper namespaces based on use cases and suddenly there is an
> exceptional need for one of the existing topics to have lesser replication,
> then instead of following a long exercise of moving that topic to a new
> namespace, you can use this feature.
>
> [0] - https://github.com/apache/pulsar/pull/12136
>
>
>
> >
> > On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma <sc...@gmail.com>
> > wrote:
> >
> > > Hello, replies inline.
> > >
> > > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <xi...@apache.org>
> > > wrote:
> > >
> > > > Hi Girish,
> > > >
> > > > Thank you for your explanation. Because Joe's email referenced the
> > > current
> > > > implementation of Pulsar, I misunderstood him to be saying that this
> > > > current implementation is not good.
> > > >
> > > > A possible use case is where there is one or a small number of topics
> > in
> > > > the namespace that store important messages, which need to be
> > replicated
> > > to
> > > > other clusters. Meanwhile, other topics only need to store data in
> the
> > > > local cluster.
> > > >
> > >
> > > Is it not possible to simply have the other topics in a namespace which
> > > allows for that other cluster, and the local topics remain in the
> > namespace
> > > with local cluster needs. Seems to me like a proper use case of two
> > > different namespaces as the use case is different in both cases.
> > >
> > >
> > >
> > > >
> > > > For example, only topic1 needs replication, while topic2 to topic100
> do
> > > > not. According to the current implementation, we need to set
> > replication
> > > > clusters at the namespace level (e.g. cluster1 and cluster2), and
> then
> > > set
> > > > the topic-level replication clusters (cluster1) for topic2 to
> topic100
> > to
> > > > exclude them. It's hard to say that this is a good design.
> > > >
> > >
> > > No, all you need is to put topic1 in namespace1 and topic2 to topic100
> in
> > > namespace2 . This is exactly what me and Joe were saying is a bad
> design
> > > choice that you are clubbing all 100 topics in same namespace.
> > >
> > >
> > >
> > > >
> > > > Best regards.
> > > >
> > > > On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com> wrote:
> > > >
> > > > > Girish,
> > > > >
> > > > > Thank you for making my point much better than I did ..
> > > > >
> > > > > -Joe
> > > > >
> > > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <
> > scrapmachines@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello Xiangying,
> > > > > >
> > > > > > I believe what Joe here is referring to as "application design"
> is
> > > not
> > > > > the
> > > > > > design of pulsar or namespace level replication but the design of
> > > your
> > > > > > application and the dependency that you've put on topic level
> > > > > replication.
> > > > > >
> > > > > > In general, I am aligned with Joe from an application design
> > > > standpoint.
> > > > > A
> > > > > > namespace is supposed to represent a single application use case,
> > > topic
> > > > > > level override of replication clusters helps in cases where there
> > > are a
> > > > > few
> > > > > > exceptional topics which do not need replication in all of the
> > > > namespace
> > > > > > clusters. This helps in saving network bandwidth, storage, CPU,
> RAM
> > > etc
> > > > > >
> > > > > > But the reason why you've raised this PIP is to bring down the
> > actual
> > > > > > replication semantics at a topic level. Yes, namespace level
> still
> > > > exists
> > > > > > as per your PIP as well, but is basically left only to be a
> > "default
> > > in
> > > > > > case topic level is missing".
> > > > > > This brings me to a very basic question - What's the use case
> that
> > > you
> > > > > are
> > > > > > trying to solve that needs these changes? Because, then what's
> > > stopping
> > > > > us
> > > > > > from bringing every construct that's at a namespace level
> > (bundling,
> > > > > > hardware affinity, etc) down to a topic level?
> > > > > >
> > > > > > Regards
> > > > > >
> > > > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <
> > xiangying@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Joe,
> > > > > > >
> > > > > > > You're correct. The initial design of the replication policy
> > leaves
> > > > > room
> > > > > > > for improvement. To address this, we aim to refine the cluster
> > > > settings
> > > > > > at
> > > > > > > the namespace level in a way that won't impact the existing
> > system.
> > > > The
> > > > > > > replication clusters should solely be used to establish full
> mesh
> > > > > > > replication for that specific namespace, without having any
> other
> > > > > > > definitions or functionalities.
> > > > > > >
> > > > > > > BR,
> > > > > > > Xiangying
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > >if users want to change the replication policy for
> > > > > > > > topic-n and do not change the replication policy of other
> > topics,
> > > > > they
> > > > > > > need
> > > > > > > > to change all the topic policy under this namespace.
> > > > > > > >
> > > > > > > > This PIP unfortunately  flows from  attempting to solve bad
> > > > > application
> > > > > > > > design
> > > > > > > >
> > > > > > > > A namespace is supposed to represent an application, and the
> > > > > namespace
> > > > > > > > policy is an umbrella for a similar set of policies  that
> > applies
> > > > to
> > > > > > all
> > > > > > > > topics.  The exceptions would be if a topic had a need for a
> > > > deficit,
> > > > > > The
> > > > > > > > case of one topic in the namespace sticking out of the
> > namespace
> > > > > policy
> > > > > > > > umbrella is bad  application design in my opinion
> > > > > > > >
> > > > > > > > -Joe.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <
> > > > xiangying@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Rajan and Girish,
> > > > > > > > > Thanks for your reply. About the question you mentioned,
> > there
> > > is
> > > > > > some
> > > > > > > > > information I want to share with you.
> > > > > > > > > >If anyone wants to setup different replication clusters
> then
> > > > > either
> > > > > > > > > >those topics can be created under different namespaces or
> > > > defined
> > > > > at
> > > > > > > > topic
> > > > > > > > > >level policy.
> > > > > > > > >
> > > > > > > > > >And users can anyway go and update the namespace's cluster
> > > list
> > > > to
> > > > > > add
> > > > > > > > the
> > > > > > > > > >missing cluster.
> > > > > > > > > Because the replication clusters also mean the clusters
> where
> > > the
> > > > > > topic
> > > > > > > > can
> > > > > > > > > be created or loaded, the topic-level replication clusters
> > can
> > > > only
> > > > > > be
> > > > > > > > the
> > > > > > > > > subset of namespace-level replication clusters.
> > > > > > > > > Just as Girish mentioned, the users can update the
> > namespace's
> > > > > > cluster
> > > > > > > > list
> > > > > > > > > to add the missing cluster.
> > > > > > > > > But there is a problem because the replication clusters as
> > the
> > > > > > > namespace
> > > > > > > > > level will create a full mesh replication for that
> namespace
> > > > across
> > > > > > the
> > > > > > > > > clusters defined in
> > > > > > > > > replication-clusters if users want to change the
> replication
> > > > policy
> > > > > > for
> > > > > > > > > topic-n and do not change the replication policy of other
> > > topics,
> > > > > > they
> > > > > > > > need
> > > > > > > > > to change all the topic policy under this namespace.
> > > > > > > > >
> > > > > > > > > > Pulsar is being used by many legacy systems and changing
> > its
> > > > > > > > > >semantics for specific usecases without considering
> > > consequences
> > > > > are
> > > > > > > > > >creating a lot of pain and incompatibility problems for
> > other
> > > > > > existing
> > > > > > > > > >systems and let's avoid doing it as we are struggling with
> > > such
> > > > > > > changes
> > > > > > > > > and
> > > > > > > > > >breaking compatibility or changing semantics are just not
> > > > > > acceptable.
> > > > > > > > >
> > > > > > > > > This proposal will not introduce an incompatibility
> problem,
> > > > > because
> > > > > > > the
> > > > > > > > > default value of the namespace policy of allowed-clusters
> and
> > > > > > > > > topic-policy-synchronized-clusters are the
> > > replication-clusters.
> > > > > > > > >
> > > > > > > > > >Allowed clusters defined at tenant level
> > > > > > > > > >will restrict tenants to create namespaces in
> > regions/clusters
> > > > > where
> > > > > > > > they
> > > > > > > > > >are not allowed.
> > > > > > > > > >As Rajan also mentioned, allowed-clusters field has a
> > > different
> > > > > > > > > meaning/purpose.
> > > > > > > > >
> > > > > > > > > Allowed clusters defined at the tenant level will restrict
> > > > tenants
> > > > > > from
> > > > > > > > > creating namespaces in regions/clusters where they are not
> > > > allowed.
> > > > > > > > > Similarly, the allowed clusters defined at the namespace
> > level
> > > > will
> > > > > > > > > restrict the namespace from creating topics in
> > regions/clusters
> > > > > where
> > > > > > > > they
> > > > > > > > > are not allowed.
> > > > > > > > > What's wrong with this?
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Xiangying
> > > > > > > > >
> > > > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > > > > > scrapmachines@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Xiangying,
> > > > > > > > > >
> > > > > > > > > > Shouldn't the solution to the issue mentioned in #21564
> [0]
> > > > > mostly
> > > > > > be
> > > > > > > > > > around validating that topic level replication clusters
> are
> > > > > subset
> > > > > > of
> > > > > > > > > > namespace level replication clusters?
> > > > > > > > > > It would be a completely compatible change as even today
> > the
> > > > case
> > > > > > > > where a
> > > > > > > > > > topic has a cluster not defined in namespaces's
> > > > > > replication-clusters
> > > > > > > > > > doesn't really work.
> > > > > > > > > > And users can anyway go and update the namespace's
> cluster
> > > list
> > > > > to
> > > > > > > add
> > > > > > > > > the
> > > > > > > > > > missing cluster.
> > > > > > > > > >
> > > > > > > > > > As Rajan also mentioned, allowed-clusters field has a
> > > different
> > > > > > > > > > meaning/purpose.
> > > > > > > > > > Regards
> > > > > > > > > >
> > > > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > > > > > xiangying@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Pulsar Community
> > > > > > > > > > >
> > > > > > > > > > > I drafted a proposal to make the configuration of
> > clusters
> > > at
> > > > > the
> > > > > > > > > > namespace
> > > > > > > > > > > level clearer. This helps solve the problem of
> > > > geo-replication
> > > > > > not
> > > > > > > > > > working
> > > > > > > > > > > correctly at the topic level.
> > > > > > > > > > >
> > > > > > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > > > > > >
> > > > > > > > > > > I'm looking forward to hearing from you.
> > > > > > > > > > >
> > > > > > > > > > > BR
> > > > > > > > > > > Xiangying
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Girish Sharma
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Girish Sharma
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Girish Sharma
> > >
> >
>
>
> --
> Girish Sharma
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Girish Sharma <sc...@gmail.com>.
Hello Xiangying,


On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng <xi...@apache.org> wrote:

> Hi Girish,
>
> What you are actually opposing is the implementation of true topic-level
> geo-replication. You believe that topics should be divided into different
> namespaces based on replication. Following this line of thought, what we
> should do is restrict the current topic-level replication settings, not
> allowing the replication clusters set at the topic level to exceed the
> range of replication clusters set in the namespace.
>

Yes, that's my viewpoint. In case that's not your view point, then in your
use cases do you ever have more than one namespace inside a tenant?
With every property coming at topic level, it makes no sense for the
namespace hierarchy to exist anymore.


>
> One point that confuses me is that we provide a setting for topic-level
> replication clusters, but it can only be used to amend the namespace
> settings and cannot work independently. Isn't this also a poor design for
> Pulsar?
>

This feature was originally added in pulsar without a PIP. And the PR [0]
also doesn't have much context around why it was needed and why it is being
added.. So I can't comment on why this was added..
But my understanding is that even in a situation when the topics are
divided into proper namespaces based on use cases and suddenly there is an
exceptional need for one of the existing topics to have lesser replication,
then instead of following a long exercise of moving that topic to a new
namespace, you can use this feature.

[0] - https://github.com/apache/pulsar/pull/12136



>
> On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma <sc...@gmail.com>
> wrote:
>
> > Hello, replies inline.
> >
> > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <xi...@apache.org>
> > wrote:
> >
> > > Hi Girish,
> > >
> > > Thank you for your explanation. Because Joe's email referenced the
> > current
> > > implementation of Pulsar, I misunderstood him to be saying that this
> > > current implementation is not good.
> > >
> > > A possible use case is where there is one or a small number of topics
> in
> > > the namespace that store important messages, which need to be
> replicated
> > to
> > > other clusters. Meanwhile, other topics only need to store data in the
> > > local cluster.
> > >
> >
> > Is it not possible to simply have the other topics in a namespace which
> > allows for that other cluster, and the local topics remain in the
> namespace
> > with local cluster needs. Seems to me like a proper use case of two
> > different namespaces as the use case is different in both cases.
> >
> >
> >
> > >
> > > For example, only topic1 needs replication, while topic2 to topic100 do
> > > not. According to the current implementation, we need to set
> replication
> > > clusters at the namespace level (e.g. cluster1 and cluster2), and then
> > set
> > > the topic-level replication clusters (cluster1) for topic2 to topic100
> to
> > > exclude them. It's hard to say that this is a good design.
> > >
> >
> > No, all you need is to put topic1 in namespace1 and topic2 to topic100 in
> > namespace2 . This is exactly what me and Joe were saying is a bad design
> > choice that you are clubbing all 100 topics in same namespace.
> >
> >
> >
> > >
> > > Best regards.
> > >
> > > On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com> wrote:
> > >
> > > > Girish,
> > > >
> > > > Thank you for making my point much better than I did ..
> > > >
> > > > -Joe
> > > >
> > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <
> scrapmachines@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello Xiangying,
> > > > >
> > > > > I believe what Joe here is referring to as "application design" is
> > not
> > > > the
> > > > > design of pulsar or namespace level replication but the design of
> > your
> > > > > application and the dependency that you've put on topic level
> > > > replication.
> > > > >
> > > > > In general, I am aligned with Joe from an application design
> > > standpoint.
> > > > A
> > > > > namespace is supposed to represent a single application use case,
> > topic
> > > > > level override of replication clusters helps in cases where there
> > are a
> > > > few
> > > > > exceptional topics which do not need replication in all of the
> > > namespace
> > > > > clusters. This helps in saving network bandwidth, storage, CPU, RAM
> > etc
> > > > >
> > > > > But the reason why you've raised this PIP is to bring down the
> actual
> > > > > replication semantics at a topic level. Yes, namespace level still
> > > exists
> > > > > as per your PIP as well, but is basically left only to be a
> "default
> > in
> > > > > case topic level is missing".
> > > > > This brings me to a very basic question - What's the use case that
> > you
> > > > are
> > > > > trying to solve that needs these changes? Because, then what's
> > stopping
> > > > us
> > > > > from bringing every construct that's at a namespace level
> (bundling,
> > > > > hardware affinity, etc) down to a topic level?
> > > > >
> > > > > Regards
> > > > >
> > > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <
> xiangying@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi Joe,
> > > > > >
> > > > > > You're correct. The initial design of the replication policy
> leaves
> > > > room
> > > > > > for improvement. To address this, we aim to refine the cluster
> > > settings
> > > > > at
> > > > > > the namespace level in a way that won't impact the existing
> system.
> > > The
> > > > > > replication clusters should solely be used to establish full mesh
> > > > > > replication for that specific namespace, without having any other
> > > > > > definitions or functionalities.
> > > > > >
> > > > > > BR,
> > > > > > Xiangying
> > > > > >
> > > > > >
> > > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > >if users want to change the replication policy for
> > > > > > > topic-n and do not change the replication policy of other
> topics,
> > > > they
> > > > > > need
> > > > > > > to change all the topic policy under this namespace.
> > > > > > >
> > > > > > > This PIP unfortunately  flows from  attempting to solve bad
> > > > application
> > > > > > > design
> > > > > > >
> > > > > > > A namespace is supposed to represent an application, and the
> > > > namespace
> > > > > > > policy is an umbrella for a similar set of policies  that
> applies
> > > to
> > > > > all
> > > > > > > topics.  The exceptions would be if a topic had a need for a
> > > deficit,
> > > > > The
> > > > > > > case of one topic in the namespace sticking out of the
> namespace
> > > > policy
> > > > > > > umbrella is bad  application design in my opinion
> > > > > > >
> > > > > > > -Joe.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <
> > > xiangying@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Rajan and Girish,
> > > > > > > > Thanks for your reply. About the question you mentioned,
> there
> > is
> > > > > some
> > > > > > > > information I want to share with you.
> > > > > > > > >If anyone wants to setup different replication clusters then
> > > > either
> > > > > > > > >those topics can be created under different namespaces or
> > > defined
> > > > at
> > > > > > > topic
> > > > > > > > >level policy.
> > > > > > > >
> > > > > > > > >And users can anyway go and update the namespace's cluster
> > list
> > > to
> > > > > add
> > > > > > > the
> > > > > > > > >missing cluster.
> > > > > > > > Because the replication clusters also mean the clusters where
> > the
> > > > > topic
> > > > > > > can
> > > > > > > > be created or loaded, the topic-level replication clusters
> can
> > > only
> > > > > be
> > > > > > > the
> > > > > > > > subset of namespace-level replication clusters.
> > > > > > > > Just as Girish mentioned, the users can update the
> namespace's
> > > > > cluster
> > > > > > > list
> > > > > > > > to add the missing cluster.
> > > > > > > > But there is a problem because the replication clusters as
> the
> > > > > > namespace
> > > > > > > > level will create a full mesh replication for that namespace
> > > across
> > > > > the
> > > > > > > > clusters defined in
> > > > > > > > replication-clusters if users want to change the replication
> > > policy
> > > > > for
> > > > > > > > topic-n and do not change the replication policy of other
> > topics,
> > > > > they
> > > > > > > need
> > > > > > > > to change all the topic policy under this namespace.
> > > > > > > >
> > > > > > > > > Pulsar is being used by many legacy systems and changing
> its
> > > > > > > > >semantics for specific usecases without considering
> > consequences
> > > > are
> > > > > > > > >creating a lot of pain and incompatibility problems for
> other
> > > > > existing
> > > > > > > > >systems and let's avoid doing it as we are struggling with
> > such
> > > > > > changes
> > > > > > > > and
> > > > > > > > >breaking compatibility or changing semantics are just not
> > > > > acceptable.
> > > > > > > >
> > > > > > > > This proposal will not introduce an incompatibility problem,
> > > > because
> > > > > > the
> > > > > > > > default value of the namespace policy of allowed-clusters and
> > > > > > > > topic-policy-synchronized-clusters are the
> > replication-clusters.
> > > > > > > >
> > > > > > > > >Allowed clusters defined at tenant level
> > > > > > > > >will restrict tenants to create namespaces in
> regions/clusters
> > > > where
> > > > > > > they
> > > > > > > > >are not allowed.
> > > > > > > > >As Rajan also mentioned, allowed-clusters field has a
> > different
> > > > > > > > meaning/purpose.
> > > > > > > >
> > > > > > > > Allowed clusters defined at the tenant level will restrict
> > > tenants
> > > > > from
> > > > > > > > creating namespaces in regions/clusters where they are not
> > > allowed.
> > > > > > > > Similarly, the allowed clusters defined at the namespace
> level
> > > will
> > > > > > > > restrict the namespace from creating topics in
> regions/clusters
> > > > where
> > > > > > > they
> > > > > > > > are not allowed.
> > > > > > > > What's wrong with this?
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Xiangying
> > > > > > > >
> > > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > > > > scrapmachines@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Xiangying,
> > > > > > > > >
> > > > > > > > > Shouldn't the solution to the issue mentioned in #21564 [0]
> > > > mostly
> > > > > be
> > > > > > > > > around validating that topic level replication clusters are
> > > > subset
> > > > > of
> > > > > > > > > namespace level replication clusters?
> > > > > > > > > It would be a completely compatible change as even today
> the
> > > case
> > > > > > > where a
> > > > > > > > > topic has a cluster not defined in namespaces's
> > > > > replication-clusters
> > > > > > > > > doesn't really work.
> > > > > > > > > And users can anyway go and update the namespace's cluster
> > list
> > > > to
> > > > > > add
> > > > > > > > the
> > > > > > > > > missing cluster.
> > > > > > > > >
> > > > > > > > > As Rajan also mentioned, allowed-clusters field has a
> > different
> > > > > > > > > meaning/purpose.
> > > > > > > > > Regards
> > > > > > > > >
> > > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > > > > xiangying@apache.org>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Pulsar Community
> > > > > > > > > >
> > > > > > > > > > I drafted a proposal to make the configuration of
> clusters
> > at
> > > > the
> > > > > > > > > namespace
> > > > > > > > > > level clearer. This helps solve the problem of
> > > geo-replication
> > > > > not
> > > > > > > > > working
> > > > > > > > > > correctly at the topic level.
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > > > > >
> > > > > > > > > > I'm looking forward to hearing from you.
> > > > > > > > > >
> > > > > > > > > > BR
> > > > > > > > > > Xiangying
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Girish Sharma
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Girish Sharma
> > > > >
> > > >
> > >
> >
> >
> > --
> > Girish Sharma
> >
>


-- 
Girish Sharma

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Girish,

What you are actually opposing is the implementation of true topic-level
geo-replication. You believe that topics should be divided into different
namespaces based on replication. Following this line of thought, what we
should do is restrict the current topic-level replication settings, not
allowing the replication clusters set at the topic level to exceed the
range of replication clusters set in the namespace.

One point that confuses me is that we provide a setting for topic-level
replication clusters, but it can only be used to amend the namespace
settings and cannot work independently. Isn't this also a poor design for
Pulsar?

On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma <sc...@gmail.com>
wrote:

> Hello, replies inline.
>
> On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <xi...@apache.org>
> wrote:
>
> > Hi Girish,
> >
> > Thank you for your explanation. Because Joe's email referenced the
> current
> > implementation of Pulsar, I misunderstood him to be saying that this
> > current implementation is not good.
> >
> > A possible use case is where there is one or a small number of topics in
> > the namespace that store important messages, which need to be replicated
> to
> > other clusters. Meanwhile, other topics only need to store data in the
> > local cluster.
> >
>
> Is it not possible to simply have the other topics in a namespace which
> allows for that other cluster, and the local topics remain in the namespace
> with local cluster needs. Seems to me like a proper use case of two
> different namespaces as the use case is different in both cases.
>
>
>
> >
> > For example, only topic1 needs replication, while topic2 to topic100 do
> > not. According to the current implementation, we need to set replication
> > clusters at the namespace level (e.g. cluster1 and cluster2), and then
> set
> > the topic-level replication clusters (cluster1) for topic2 to topic100 to
> > exclude them. It's hard to say that this is a good design.
> >
>
> No, all you need is to put topic1 in namespace1 and topic2 to topic100 in
> namespace2 . This is exactly what me and Joe were saying is a bad design
> choice that you are clubbing all 100 topics in same namespace.
>
>
>
> >
> > Best regards.
> >
> > On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com> wrote:
> >
> > > Girish,
> > >
> > > Thank you for making my point much better than I did ..
> > >
> > > -Joe
> > >
> > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <sc...@gmail.com>
> > > wrote:
> > >
> > > > Hello Xiangying,
> > > >
> > > > I believe what Joe here is referring to as "application design" is
> not
> > > the
> > > > design of pulsar or namespace level replication but the design of
> your
> > > > application and the dependency that you've put on topic level
> > > replication.
> > > >
> > > > In general, I am aligned with Joe from an application design
> > standpoint.
> > > A
> > > > namespace is supposed to represent a single application use case,
> topic
> > > > level override of replication clusters helps in cases where there
> are a
> > > few
> > > > exceptional topics which do not need replication in all of the
> > namespace
> > > > clusters. This helps in saving network bandwidth, storage, CPU, RAM
> etc
> > > >
> > > > But the reason why you've raised this PIP is to bring down the actual
> > > > replication semantics at a topic level. Yes, namespace level still
> > exists
> > > > as per your PIP as well, but is basically left only to be a "default
> in
> > > > case topic level is missing".
> > > > This brings me to a very basic question - What's the use case that
> you
> > > are
> > > > trying to solve that needs these changes? Because, then what's
> stopping
> > > us
> > > > from bringing every construct that's at a namespace level (bundling,
> > > > hardware affinity, etc) down to a topic level?
> > > >
> > > > Regards
> > > >
> > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <xi...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi Joe,
> > > > >
> > > > > You're correct. The initial design of the replication policy leaves
> > > room
> > > > > for improvement. To address this, we aim to refine the cluster
> > settings
> > > > at
> > > > > the namespace level in a way that won't impact the existing system.
> > The
> > > > > replication clusters should solely be used to establish full mesh
> > > > > replication for that specific namespace, without having any other
> > > > > definitions or functionalities.
> > > > >
> > > > > BR,
> > > > > Xiangying
> > > > >
> > > > >
> > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com>
> wrote:
> > > > >
> > > > > > >if users want to change the replication policy for
> > > > > > topic-n and do not change the replication policy of other topics,
> > > they
> > > > > need
> > > > > > to change all the topic policy under this namespace.
> > > > > >
> > > > > > This PIP unfortunately  flows from  attempting to solve bad
> > > application
> > > > > > design
> > > > > >
> > > > > > A namespace is supposed to represent an application, and the
> > > namespace
> > > > > > policy is an umbrella for a similar set of policies  that applies
> > to
> > > > all
> > > > > > topics.  The exceptions would be if a topic had a need for a
> > deficit,
> > > > The
> > > > > > case of one topic in the namespace sticking out of the namespace
> > > policy
> > > > > > umbrella is bad  application design in my opinion
> > > > > >
> > > > > > -Joe.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <
> > xiangying@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Rajan and Girish,
> > > > > > > Thanks for your reply. About the question you mentioned, there
> is
> > > > some
> > > > > > > information I want to share with you.
> > > > > > > >If anyone wants to setup different replication clusters then
> > > either
> > > > > > > >those topics can be created under different namespaces or
> > defined
> > > at
> > > > > > topic
> > > > > > > >level policy.
> > > > > > >
> > > > > > > >And users can anyway go and update the namespace's cluster
> list
> > to
> > > > add
> > > > > > the
> > > > > > > >missing cluster.
> > > > > > > Because the replication clusters also mean the clusters where
> the
> > > > topic
> > > > > > can
> > > > > > > be created or loaded, the topic-level replication clusters can
> > only
> > > > be
> > > > > > the
> > > > > > > subset of namespace-level replication clusters.
> > > > > > > Just as Girish mentioned, the users can update the namespace's
> > > > cluster
> > > > > > list
> > > > > > > to add the missing cluster.
> > > > > > > But there is a problem because the replication clusters as the
> > > > > namespace
> > > > > > > level will create a full mesh replication for that namespace
> > across
> > > > the
> > > > > > > clusters defined in
> > > > > > > replication-clusters if users want to change the replication
> > policy
> > > > for
> > > > > > > topic-n and do not change the replication policy of other
> topics,
> > > > they
> > > > > > need
> > > > > > > to change all the topic policy under this namespace.
> > > > > > >
> > > > > > > > Pulsar is being used by many legacy systems and changing its
> > > > > > > >semantics for specific usecases without considering
> consequences
> > > are
> > > > > > > >creating a lot of pain and incompatibility problems for other
> > > > existing
> > > > > > > >systems and let's avoid doing it as we are struggling with
> such
> > > > > changes
> > > > > > > and
> > > > > > > >breaking compatibility or changing semantics are just not
> > > > acceptable.
> > > > > > >
> > > > > > > This proposal will not introduce an incompatibility problem,
> > > because
> > > > > the
> > > > > > > default value of the namespace policy of allowed-clusters and
> > > > > > > topic-policy-synchronized-clusters are the
> replication-clusters.
> > > > > > >
> > > > > > > >Allowed clusters defined at tenant level
> > > > > > > >will restrict tenants to create namespaces in regions/clusters
> > > where
> > > > > > they
> > > > > > > >are not allowed.
> > > > > > > >As Rajan also mentioned, allowed-clusters field has a
> different
> > > > > > > meaning/purpose.
> > > > > > >
> > > > > > > Allowed clusters defined at the tenant level will restrict
> > tenants
> > > > from
> > > > > > > creating namespaces in regions/clusters where they are not
> > allowed.
> > > > > > > Similarly, the allowed clusters defined at the namespace level
> > will
> > > > > > > restrict the namespace from creating topics in regions/clusters
> > > where
> > > > > > they
> > > > > > > are not allowed.
> > > > > > > What's wrong with this?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Xiangying
> > > > > > >
> > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > > > scrapmachines@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Xiangying,
> > > > > > > >
> > > > > > > > Shouldn't the solution to the issue mentioned in #21564 [0]
> > > mostly
> > > > be
> > > > > > > > around validating that topic level replication clusters are
> > > subset
> > > > of
> > > > > > > > namespace level replication clusters?
> > > > > > > > It would be a completely compatible change as even today the
> > case
> > > > > > where a
> > > > > > > > topic has a cluster not defined in namespaces's
> > > > replication-clusters
> > > > > > > > doesn't really work.
> > > > > > > > And users can anyway go and update the namespace's cluster
> list
> > > to
> > > > > add
> > > > > > > the
> > > > > > > > missing cluster.
> > > > > > > >
> > > > > > > > As Rajan also mentioned, allowed-clusters field has a
> different
> > > > > > > > meaning/purpose.
> > > > > > > > Regards
> > > > > > > >
> > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > > > xiangying@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Pulsar Community
> > > > > > > > >
> > > > > > > > > I drafted a proposal to make the configuration of clusters
> at
> > > the
> > > > > > > > namespace
> > > > > > > > > level clearer. This helps solve the problem of
> > geo-replication
> > > > not
> > > > > > > > working
> > > > > > > > > correctly at the topic level.
> > > > > > > > >
> > > > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > > > >
> > > > > > > > > I'm looking forward to hearing from you.
> > > > > > > > >
> > > > > > > > > BR
> > > > > > > > > Xiangying
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Girish Sharma
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Girish Sharma
> > > >
> > >
> >
>
>
> --
> Girish Sharma
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Girish Sharma <sc...@gmail.com>.
Hello, replies inline.

On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <xi...@apache.org> wrote:

> Hi Girish,
>
> Thank you for your explanation. Because Joe's email referenced the current
> implementation of Pulsar, I misunderstood him to be saying that this
> current implementation is not good.
>
> A possible use case is where there is one or a small number of topics in
> the namespace that store important messages, which need to be replicated to
> other clusters. Meanwhile, other topics only need to store data in the
> local cluster.
>

Is it not possible to simply have the other topics in a namespace which
allows for that other cluster, and the local topics remain in the namespace
with local cluster needs. Seems to me like a proper use case of two
different namespaces as the use case is different in both cases.



>
> For example, only topic1 needs replication, while topic2 to topic100 do
> not. According to the current implementation, we need to set replication
> clusters at the namespace level (e.g. cluster1 and cluster2), and then set
> the topic-level replication clusters (cluster1) for topic2 to topic100 to
> exclude them. It's hard to say that this is a good design.
>

No, all you need is to put topic1 in namespace1 and topic2 to topic100 in
namespace2 . This is exactly what me and Joe were saying is a bad design
choice that you are clubbing all 100 topics in same namespace.



>
> Best regards.
>
> On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com> wrote:
>
> > Girish,
> >
> > Thank you for making my point much better than I did ..
> >
> > -Joe
> >
> > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <sc...@gmail.com>
> > wrote:
> >
> > > Hello Xiangying,
> > >
> > > I believe what Joe here is referring to as "application design" is not
> > the
> > > design of pulsar or namespace level replication but the design of your
> > > application and the dependency that you've put on topic level
> > replication.
> > >
> > > In general, I am aligned with Joe from an application design
> standpoint.
> > A
> > > namespace is supposed to represent a single application use case, topic
> > > level override of replication clusters helps in cases where there are a
> > few
> > > exceptional topics which do not need replication in all of the
> namespace
> > > clusters. This helps in saving network bandwidth, storage, CPU, RAM etc
> > >
> > > But the reason why you've raised this PIP is to bring down the actual
> > > replication semantics at a topic level. Yes, namespace level still
> exists
> > > as per your PIP as well, but is basically left only to be a "default in
> > > case topic level is missing".
> > > This brings me to a very basic question - What's the use case that you
> > are
> > > trying to solve that needs these changes? Because, then what's stopping
> > us
> > > from bringing every construct that's at a namespace level (bundling,
> > > hardware affinity, etc) down to a topic level?
> > >
> > > Regards
> > >
> > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <xi...@apache.org>
> > > wrote:
> > >
> > > > Hi Joe,
> > > >
> > > > You're correct. The initial design of the replication policy leaves
> > room
> > > > for improvement. To address this, we aim to refine the cluster
> settings
> > > at
> > > > the namespace level in a way that won't impact the existing system.
> The
> > > > replication clusters should solely be used to establish full mesh
> > > > replication for that specific namespace, without having any other
> > > > definitions or functionalities.
> > > >
> > > > BR,
> > > > Xiangying
> > > >
> > > >
> > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com> wrote:
> > > >
> > > > > >if users want to change the replication policy for
> > > > > topic-n and do not change the replication policy of other topics,
> > they
> > > > need
> > > > > to change all the topic policy under this namespace.
> > > > >
> > > > > This PIP unfortunately  flows from  attempting to solve bad
> > application
> > > > > design
> > > > >
> > > > > A namespace is supposed to represent an application, and the
> > namespace
> > > > > policy is an umbrella for a similar set of policies  that applies
> to
> > > all
> > > > > topics.  The exceptions would be if a topic had a need for a
> deficit,
> > > The
> > > > > case of one topic in the namespace sticking out of the namespace
> > policy
> > > > > umbrella is bad  application design in my opinion
> > > > >
> > > > > -Joe.
> > > > >
> > > > >
> > > > >
> > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <
> xiangying@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi Rajan and Girish,
> > > > > > Thanks for your reply. About the question you mentioned, there is
> > > some
> > > > > > information I want to share with you.
> > > > > > >If anyone wants to setup different replication clusters then
> > either
> > > > > > >those topics can be created under different namespaces or
> defined
> > at
> > > > > topic
> > > > > > >level policy.
> > > > > >
> > > > > > >And users can anyway go and update the namespace's cluster list
> to
> > > add
> > > > > the
> > > > > > >missing cluster.
> > > > > > Because the replication clusters also mean the clusters where the
> > > topic
> > > > > can
> > > > > > be created or loaded, the topic-level replication clusters can
> only
> > > be
> > > > > the
> > > > > > subset of namespace-level replication clusters.
> > > > > > Just as Girish mentioned, the users can update the namespace's
> > > cluster
> > > > > list
> > > > > > to add the missing cluster.
> > > > > > But there is a problem because the replication clusters as the
> > > > namespace
> > > > > > level will create a full mesh replication for that namespace
> across
> > > the
> > > > > > clusters defined in
> > > > > > replication-clusters if users want to change the replication
> policy
> > > for
> > > > > > topic-n and do not change the replication policy of other topics,
> > > they
> > > > > need
> > > > > > to change all the topic policy under this namespace.
> > > > > >
> > > > > > > Pulsar is being used by many legacy systems and changing its
> > > > > > >semantics for specific usecases without considering consequences
> > are
> > > > > > >creating a lot of pain and incompatibility problems for other
> > > existing
> > > > > > >systems and let's avoid doing it as we are struggling with such
> > > > changes
> > > > > > and
> > > > > > >breaking compatibility or changing semantics are just not
> > > acceptable.
> > > > > >
> > > > > > This proposal will not introduce an incompatibility problem,
> > because
> > > > the
> > > > > > default value of the namespace policy of allowed-clusters and
> > > > > > topic-policy-synchronized-clusters are the replication-clusters.
> > > > > >
> > > > > > >Allowed clusters defined at tenant level
> > > > > > >will restrict tenants to create namespaces in regions/clusters
> > where
> > > > > they
> > > > > > >are not allowed.
> > > > > > >As Rajan also mentioned, allowed-clusters field has a different
> > > > > > meaning/purpose.
> > > > > >
> > > > > > Allowed clusters defined at the tenant level will restrict
> tenants
> > > from
> > > > > > creating namespaces in regions/clusters where they are not
> allowed.
> > > > > > Similarly, the allowed clusters defined at the namespace level
> will
> > > > > > restrict the namespace from creating topics in regions/clusters
> > where
> > > > > they
> > > > > > are not allowed.
> > > > > > What's wrong with this?
> > > > > >
> > > > > > Regards,
> > > > > > Xiangying
> > > > > >
> > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > > scrapmachines@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Xiangying,
> > > > > > >
> > > > > > > Shouldn't the solution to the issue mentioned in #21564 [0]
> > mostly
> > > be
> > > > > > > around validating that topic level replication clusters are
> > subset
> > > of
> > > > > > > namespace level replication clusters?
> > > > > > > It would be a completely compatible change as even today the
> case
> > > > > where a
> > > > > > > topic has a cluster not defined in namespaces's
> > > replication-clusters
> > > > > > > doesn't really work.
> > > > > > > And users can anyway go and update the namespace's cluster list
> > to
> > > > add
> > > > > > the
> > > > > > > missing cluster.
> > > > > > >
> > > > > > > As Rajan also mentioned, allowed-clusters field has a different
> > > > > > > meaning/purpose.
> > > > > > > Regards
> > > > > > >
> > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > > xiangying@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi, Pulsar Community
> > > > > > > >
> > > > > > > > I drafted a proposal to make the configuration of clusters at
> > the
> > > > > > > namespace
> > > > > > > > level clearer. This helps solve the problem of
> geo-replication
> > > not
> > > > > > > working
> > > > > > > > correctly at the topic level.
> > > > > > > >
> > > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > > >
> > > > > > > > I'm looking forward to hearing from you.
> > > > > > > >
> > > > > > > > BR
> > > > > > > > Xiangying
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Girish Sharma
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Girish Sharma
> > >
> >
>


-- 
Girish Sharma

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Girish,

>But the reason why you've raised this PIP is to bring down the actual
replication semantics at a topic level. Yes, namespace level still exists
as per your PIP as well, but is basically left only to be a "default in
case topic level is missing".

I'm afraid there's some misunderstanding here. According to the Pulsar
website, replication can actually be enabled at the topic level.

>You can enable geo-replication at namespace or topic level. [1]

So, it's not this proposal that introduces topic-level replication
semantics. Prior to this, topic-level replication was constrained by the
namespace-level replication policy. There are just some problems here. If
replication was not configured at the namespace level, then topic-level
replication would also be ineffective. Moreover, users would not be aware
of this replication failure.

> Yes, namespace level still exists as per your PIP as well, but is
basically left only to be a "default in case topic level is missing".

This behavior is consistent with the current behavior of Pulsar and is not
something introduced by this proposal.
This proposal introduces an `allowed-cluster` configuration at the
namespace level.
As the website states, you can enable replication at either the namespace
or topic level.
But If you only enable replication at the topic level, the replication
configuration would not take effect prior to this proposal.

Before this proposal: even though the topic policy can be updated
successfully, topic1 cannot be created in cluster2.
```
Namespace policy {replication clusters -> local cluster(cluster1)}, topic1
policy {replication clusters {cluster1, cluster2}}
```
After this proposal: you can set allowed clusters at the namespace level,
which specifies the clusters where topics under this namespace are allowed.
Then, the topic-level replication would also be effective, as described on
the Pulsar website.
```
Namespace policy {replication clusters -> local cluster(cluster1), allowed
clusters -> {cluster1, cluster2, cluster3}}, topic1 policy {replication
clusters {cluster1, cluster2}}
```

[1]
https://pulsar.apache.org/docs/3.1.x/administration-geo/#enable-geo-replication


On Wed, Dec 6, 2023 at 7:57 PM Xiangying Meng <xi...@apache.org> wrote:

> Hi Girish,
>
> Thank you for your explanation. Because Joe's email referenced the current
> implementation of Pulsar, I misunderstood him to be saying that this
> current implementation is not good.
>
> A possible use case is where there is one or a small number of topics in
> the namespace that store important messages, which need to be replicated to
> other clusters. Meanwhile, other topics only need to store data in the
> local cluster.
>
> For example, only topic1 needs replication, while topic2 to topic100 do
> not. According to the current implementation, we need to set replication
> clusters at the namespace level (e.g. cluster1 and cluster2), and then set
> the topic-level replication clusters (cluster1) for topic2 to topic100 to
> exclude them. It's hard to say that this is a good design.
>
> Best regards.
>
> On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com> wrote:
>
>> Girish,
>>
>> Thank you for making my point much better than I did ..
>>
>> -Joe
>>
>> On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <sc...@gmail.com>
>> wrote:
>>
>> > Hello Xiangying,
>> >
>> > I believe what Joe here is referring to as "application design" is not
>> the
>> > design of pulsar or namespace level replication but the design of your
>> > application and the dependency that you've put on topic level
>> replication.
>> >
>> > In general, I am aligned with Joe from an application design
>> standpoint. A
>> > namespace is supposed to represent a single application use case, topic
>> > level override of replication clusters helps in cases where there are a
>> few
>> > exceptional topics which do not need replication in all of the namespace
>> > clusters. This helps in saving network bandwidth, storage, CPU, RAM etc
>> >
>> > But the reason why you've raised this PIP is to bring down the actual
>> > replication semantics at a topic level. Yes, namespace level still
>> exists
>> > as per your PIP as well, but is basically left only to be a "default in
>> > case topic level is missing".
>> > This brings me to a very basic question - What's the use case that you
>> are
>> > trying to solve that needs these changes? Because, then what's stopping
>> us
>> > from bringing every construct that's at a namespace level (bundling,
>> > hardware affinity, etc) down to a topic level?
>> >
>> > Regards
>> >
>> > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <xi...@apache.org>
>> > wrote:
>> >
>> > > Hi Joe,
>> > >
>> > > You're correct. The initial design of the replication policy leaves
>> room
>> > > for improvement. To address this, we aim to refine the cluster
>> settings
>> > at
>> > > the namespace level in a way that won't impact the existing system.
>> The
>> > > replication clusters should solely be used to establish full mesh
>> > > replication for that specific namespace, without having any other
>> > > definitions or functionalities.
>> > >
>> > > BR,
>> > > Xiangying
>> > >
>> > >
>> > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com> wrote:
>> > >
>> > > > >if users want to change the replication policy for
>> > > > topic-n and do not change the replication policy of other topics,
>> they
>> > > need
>> > > > to change all the topic policy under this namespace.
>> > > >
>> > > > This PIP unfortunately  flows from  attempting to solve bad
>> application
>> > > > design
>> > > >
>> > > > A namespace is supposed to represent an application, and the
>> namespace
>> > > > policy is an umbrella for a similar set of policies  that applies to
>> > all
>> > > > topics.  The exceptions would be if a topic had a need for a
>> deficit,
>> > The
>> > > > case of one topic in the namespace sticking out of the namespace
>> policy
>> > > > umbrella is bad  application design in my opinion
>> > > >
>> > > > -Joe.
>> > > >
>> > > >
>> > > >
>> > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <xiangying@apache.org
>> >
>> > > > wrote:
>> > > >
>> > > > > Hi Rajan and Girish,
>> > > > > Thanks for your reply. About the question you mentioned, there is
>> > some
>> > > > > information I want to share with you.
>> > > > > >If anyone wants to setup different replication clusters then
>> either
>> > > > > >those topics can be created under different namespaces or
>> defined at
>> > > > topic
>> > > > > >level policy.
>> > > > >
>> > > > > >And users can anyway go and update the namespace's cluster list
>> to
>> > add
>> > > > the
>> > > > > >missing cluster.
>> > > > > Because the replication clusters also mean the clusters where the
>> > topic
>> > > > can
>> > > > > be created or loaded, the topic-level replication clusters can
>> only
>> > be
>> > > > the
>> > > > > subset of namespace-level replication clusters.
>> > > > > Just as Girish mentioned, the users can update the namespace's
>> > cluster
>> > > > list
>> > > > > to add the missing cluster.
>> > > > > But there is a problem because the replication clusters as the
>> > > namespace
>> > > > > level will create a full mesh replication for that namespace
>> across
>> > the
>> > > > > clusters defined in
>> > > > > replication-clusters if users want to change the replication
>> policy
>> > for
>> > > > > topic-n and do not change the replication policy of other topics,
>> > they
>> > > > need
>> > > > > to change all the topic policy under this namespace.
>> > > > >
>> > > > > > Pulsar is being used by many legacy systems and changing its
>> > > > > >semantics for specific usecases without considering consequences
>> are
>> > > > > >creating a lot of pain and incompatibility problems for other
>> > existing
>> > > > > >systems and let's avoid doing it as we are struggling with such
>> > > changes
>> > > > > and
>> > > > > >breaking compatibility or changing semantics are just not
>> > acceptable.
>> > > > >
>> > > > > This proposal will not introduce an incompatibility problem,
>> because
>> > > the
>> > > > > default value of the namespace policy of allowed-clusters and
>> > > > > topic-policy-synchronized-clusters are the replication-clusters.
>> > > > >
>> > > > > >Allowed clusters defined at tenant level
>> > > > > >will restrict tenants to create namespaces in regions/clusters
>> where
>> > > > they
>> > > > > >are not allowed.
>> > > > > >As Rajan also mentioned, allowed-clusters field has a different
>> > > > > meaning/purpose.
>> > > > >
>> > > > > Allowed clusters defined at the tenant level will restrict tenants
>> > from
>> > > > > creating namespaces in regions/clusters where they are not
>> allowed.
>> > > > > Similarly, the allowed clusters defined at the namespace level
>> will
>> > > > > restrict the namespace from creating topics in regions/clusters
>> where
>> > > > they
>> > > > > are not allowed.
>> > > > > What's wrong with this?
>> > > > >
>> > > > > Regards,
>> > > > > Xiangying
>> > > > >
>> > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
>> > scrapmachines@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Xiangying,
>> > > > > >
>> > > > > > Shouldn't the solution to the issue mentioned in #21564 [0]
>> mostly
>> > be
>> > > > > > around validating that topic level replication clusters are
>> subset
>> > of
>> > > > > > namespace level replication clusters?
>> > > > > > It would be a completely compatible change as even today the
>> case
>> > > > where a
>> > > > > > topic has a cluster not defined in namespaces's
>> > replication-clusters
>> > > > > > doesn't really work.
>> > > > > > And users can anyway go and update the namespace's cluster list
>> to
>> > > add
>> > > > > the
>> > > > > > missing cluster.
>> > > > > >
>> > > > > > As Rajan also mentioned, allowed-clusters field has a different
>> > > > > > meaning/purpose.
>> > > > > > Regards
>> > > > > >
>> > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
>> > > xiangying@apache.org>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi, Pulsar Community
>> > > > > > >
>> > > > > > > I drafted a proposal to make the configuration of clusters at
>> the
>> > > > > > namespace
>> > > > > > > level clearer. This helps solve the problem of geo-replication
>> > not
>> > > > > > working
>> > > > > > > correctly at the topic level.
>> > > > > > >
>> > > > > > > https://github.com/apache/pulsar/pull/21648
>> > > > > > >
>> > > > > > > I'm looking forward to hearing from you.
>> > > > > > >
>> > > > > > > BR
>> > > > > > > Xiangying
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Girish Sharma
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> >
>> > --
>> > Girish Sharma
>> >
>>
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Girish,

Thank you for your explanation. Because Joe's email referenced the current
implementation of Pulsar, I misunderstood him to be saying that this
current implementation is not good.

A possible use case is where there is one or a small number of topics in
the namespace that store important messages, which need to be replicated to
other clusters. Meanwhile, other topics only need to store data in the
local cluster.

For example, only topic1 needs replication, while topic2 to topic100 do
not. According to the current implementation, we need to set replication
clusters at the namespace level (e.g. cluster1 and cluster2), and then set
the topic-level replication clusters (cluster1) for topic2 to topic100 to
exclude them. It's hard to say that this is a good design.

Best regards.

On Wed, Dec 6, 2023 at 12:49 PM Joe F <jo...@gmail.com> wrote:

> Girish,
>
> Thank you for making my point much better than I did ..
>
> -Joe
>
> On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <sc...@gmail.com>
> wrote:
>
> > Hello Xiangying,
> >
> > I believe what Joe here is referring to as "application design" is not
> the
> > design of pulsar or namespace level replication but the design of your
> > application and the dependency that you've put on topic level
> replication.
> >
> > In general, I am aligned with Joe from an application design standpoint.
> A
> > namespace is supposed to represent a single application use case, topic
> > level override of replication clusters helps in cases where there are a
> few
> > exceptional topics which do not need replication in all of the namespace
> > clusters. This helps in saving network bandwidth, storage, CPU, RAM etc
> >
> > But the reason why you've raised this PIP is to bring down the actual
> > replication semantics at a topic level. Yes, namespace level still exists
> > as per your PIP as well, but is basically left only to be a "default in
> > case topic level is missing".
> > This brings me to a very basic question - What's the use case that you
> are
> > trying to solve that needs these changes? Because, then what's stopping
> us
> > from bringing every construct that's at a namespace level (bundling,
> > hardware affinity, etc) down to a topic level?
> >
> > Regards
> >
> > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <xi...@apache.org>
> > wrote:
> >
> > > Hi Joe,
> > >
> > > You're correct. The initial design of the replication policy leaves
> room
> > > for improvement. To address this, we aim to refine the cluster settings
> > at
> > > the namespace level in a way that won't impact the existing system. The
> > > replication clusters should solely be used to establish full mesh
> > > replication for that specific namespace, without having any other
> > > definitions or functionalities.
> > >
> > > BR,
> > > Xiangying
> > >
> > >
> > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com> wrote:
> > >
> > > > >if users want to change the replication policy for
> > > > topic-n and do not change the replication policy of other topics,
> they
> > > need
> > > > to change all the topic policy under this namespace.
> > > >
> > > > This PIP unfortunately  flows from  attempting to solve bad
> application
> > > > design
> > > >
> > > > A namespace is supposed to represent an application, and the
> namespace
> > > > policy is an umbrella for a similar set of policies  that applies to
> > all
> > > > topics.  The exceptions would be if a topic had a need for a deficit,
> > The
> > > > case of one topic in the namespace sticking out of the namespace
> policy
> > > > umbrella is bad  application design in my opinion
> > > >
> > > > -Joe.
> > > >
> > > >
> > > >
> > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <xi...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi Rajan and Girish,
> > > > > Thanks for your reply. About the question you mentioned, there is
> > some
> > > > > information I want to share with you.
> > > > > >If anyone wants to setup different replication clusters then
> either
> > > > > >those topics can be created under different namespaces or defined
> at
> > > > topic
> > > > > >level policy.
> > > > >
> > > > > >And users can anyway go and update the namespace's cluster list to
> > add
> > > > the
> > > > > >missing cluster.
> > > > > Because the replication clusters also mean the clusters where the
> > topic
> > > > can
> > > > > be created or loaded, the topic-level replication clusters can only
> > be
> > > > the
> > > > > subset of namespace-level replication clusters.
> > > > > Just as Girish mentioned, the users can update the namespace's
> > cluster
> > > > list
> > > > > to add the missing cluster.
> > > > > But there is a problem because the replication clusters as the
> > > namespace
> > > > > level will create a full mesh replication for that namespace across
> > the
> > > > > clusters defined in
> > > > > replication-clusters if users want to change the replication policy
> > for
> > > > > topic-n and do not change the replication policy of other topics,
> > they
> > > > need
> > > > > to change all the topic policy under this namespace.
> > > > >
> > > > > > Pulsar is being used by many legacy systems and changing its
> > > > > >semantics for specific usecases without considering consequences
> are
> > > > > >creating a lot of pain and incompatibility problems for other
> > existing
> > > > > >systems and let's avoid doing it as we are struggling with such
> > > changes
> > > > > and
> > > > > >breaking compatibility or changing semantics are just not
> > acceptable.
> > > > >
> > > > > This proposal will not introduce an incompatibility problem,
> because
> > > the
> > > > > default value of the namespace policy of allowed-clusters and
> > > > > topic-policy-synchronized-clusters are the replication-clusters.
> > > > >
> > > > > >Allowed clusters defined at tenant level
> > > > > >will restrict tenants to create namespaces in regions/clusters
> where
> > > > they
> > > > > >are not allowed.
> > > > > >As Rajan also mentioned, allowed-clusters field has a different
> > > > > meaning/purpose.
> > > > >
> > > > > Allowed clusters defined at the tenant level will restrict tenants
> > from
> > > > > creating namespaces in regions/clusters where they are not allowed.
> > > > > Similarly, the allowed clusters defined at the namespace level will
> > > > > restrict the namespace from creating topics in regions/clusters
> where
> > > > they
> > > > > are not allowed.
> > > > > What's wrong with this?
> > > > >
> > > > > Regards,
> > > > > Xiangying
> > > > >
> > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> > scrapmachines@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Xiangying,
> > > > > >
> > > > > > Shouldn't the solution to the issue mentioned in #21564 [0]
> mostly
> > be
> > > > > > around validating that topic level replication clusters are
> subset
> > of
> > > > > > namespace level replication clusters?
> > > > > > It would be a completely compatible change as even today the case
> > > > where a
> > > > > > topic has a cluster not defined in namespaces's
> > replication-clusters
> > > > > > doesn't really work.
> > > > > > And users can anyway go and update the namespace's cluster list
> to
> > > add
> > > > > the
> > > > > > missing cluster.
> > > > > >
> > > > > > As Rajan also mentioned, allowed-clusters field has a different
> > > > > > meaning/purpose.
> > > > > > Regards
> > > > > >
> > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > > xiangying@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi, Pulsar Community
> > > > > > >
> > > > > > > I drafted a proposal to make the configuration of clusters at
> the
> > > > > > namespace
> > > > > > > level clearer. This helps solve the problem of geo-replication
> > not
> > > > > > working
> > > > > > > correctly at the topic level.
> > > > > > >
> > > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > > >
> > > > > > > I'm looking forward to hearing from you.
> > > > > > >
> > > > > > > BR
> > > > > > > Xiangying
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Girish Sharma
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Girish Sharma
> >
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Joe F <jo...@gmail.com>.
Girish,

Thank you for making my point much better than I did ..

-Joe

On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <sc...@gmail.com>
wrote:

> Hello Xiangying,
>
> I believe what Joe here is referring to as "application design" is not the
> design of pulsar or namespace level replication but the design of your
> application and the dependency that you've put on topic level replication.
>
> In general, I am aligned with Joe from an application design standpoint. A
> namespace is supposed to represent a single application use case, topic
> level override of replication clusters helps in cases where there are a few
> exceptional topics which do not need replication in all of the namespace
> clusters. This helps in saving network bandwidth, storage, CPU, RAM etc
>
> But the reason why you've raised this PIP is to bring down the actual
> replication semantics at a topic level. Yes, namespace level still exists
> as per your PIP as well, but is basically left only to be a "default in
> case topic level is missing".
> This brings me to a very basic question - What's the use case that you are
> trying to solve that needs these changes? Because, then what's stopping us
> from bringing every construct that's at a namespace level (bundling,
> hardware affinity, etc) down to a topic level?
>
> Regards
>
> On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <xi...@apache.org>
> wrote:
>
> > Hi Joe,
> >
> > You're correct. The initial design of the replication policy leaves room
> > for improvement. To address this, we aim to refine the cluster settings
> at
> > the namespace level in a way that won't impact the existing system. The
> > replication clusters should solely be used to establish full mesh
> > replication for that specific namespace, without having any other
> > definitions or functionalities.
> >
> > BR,
> > Xiangying
> >
> >
> > On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com> wrote:
> >
> > > >if users want to change the replication policy for
> > > topic-n and do not change the replication policy of other topics, they
> > need
> > > to change all the topic policy under this namespace.
> > >
> > > This PIP unfortunately  flows from  attempting to solve bad application
> > > design
> > >
> > > A namespace is supposed to represent an application, and the namespace
> > > policy is an umbrella for a similar set of policies  that applies to
> all
> > > topics.  The exceptions would be if a topic had a need for a deficit,
> The
> > > case of one topic in the namespace sticking out of the namespace policy
> > > umbrella is bad  application design in my opinion
> > >
> > > -Joe.
> > >
> > >
> > >
> > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <xi...@apache.org>
> > > wrote:
> > >
> > > > Hi Rajan and Girish,
> > > > Thanks for your reply. About the question you mentioned, there is
> some
> > > > information I want to share with you.
> > > > >If anyone wants to setup different replication clusters then either
> > > > >those topics can be created under different namespaces or defined at
> > > topic
> > > > >level policy.
> > > >
> > > > >And users can anyway go and update the namespace's cluster list to
> add
> > > the
> > > > >missing cluster.
> > > > Because the replication clusters also mean the clusters where the
> topic
> > > can
> > > > be created or loaded, the topic-level replication clusters can only
> be
> > > the
> > > > subset of namespace-level replication clusters.
> > > > Just as Girish mentioned, the users can update the namespace's
> cluster
> > > list
> > > > to add the missing cluster.
> > > > But there is a problem because the replication clusters as the
> > namespace
> > > > level will create a full mesh replication for that namespace across
> the
> > > > clusters defined in
> > > > replication-clusters if users want to change the replication policy
> for
> > > > topic-n and do not change the replication policy of other topics,
> they
> > > need
> > > > to change all the topic policy under this namespace.
> > > >
> > > > > Pulsar is being used by many legacy systems and changing its
> > > > >semantics for specific usecases without considering consequences are
> > > > >creating a lot of pain and incompatibility problems for other
> existing
> > > > >systems and let's avoid doing it as we are struggling with such
> > changes
> > > > and
> > > > >breaking compatibility or changing semantics are just not
> acceptable.
> > > >
> > > > This proposal will not introduce an incompatibility problem, because
> > the
> > > > default value of the namespace policy of allowed-clusters and
> > > > topic-policy-synchronized-clusters are the replication-clusters.
> > > >
> > > > >Allowed clusters defined at tenant level
> > > > >will restrict tenants to create namespaces in regions/clusters where
> > > they
> > > > >are not allowed.
> > > > >As Rajan also mentioned, allowed-clusters field has a different
> > > > meaning/purpose.
> > > >
> > > > Allowed clusters defined at the tenant level will restrict tenants
> from
> > > > creating namespaces in regions/clusters where they are not allowed.
> > > > Similarly, the allowed clusters defined at the namespace level will
> > > > restrict the namespace from creating topics in regions/clusters where
> > > they
> > > > are not allowed.
> > > > What's wrong with this?
> > > >
> > > > Regards,
> > > > Xiangying
> > > >
> > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
> scrapmachines@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Xiangying,
> > > > >
> > > > > Shouldn't the solution to the issue mentioned in #21564 [0] mostly
> be
> > > > > around validating that topic level replication clusters are subset
> of
> > > > > namespace level replication clusters?
> > > > > It would be a completely compatible change as even today the case
> > > where a
> > > > > topic has a cluster not defined in namespaces's
> replication-clusters
> > > > > doesn't really work.
> > > > > And users can anyway go and update the namespace's cluster list to
> > add
> > > > the
> > > > > missing cluster.
> > > > >
> > > > > As Rajan also mentioned, allowed-clusters field has a different
> > > > > meaning/purpose.
> > > > > Regards
> > > > >
> > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> > xiangying@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi, Pulsar Community
> > > > > >
> > > > > > I drafted a proposal to make the configuration of clusters at the
> > > > > namespace
> > > > > > level clearer. This helps solve the problem of geo-replication
> not
> > > > > working
> > > > > > correctly at the topic level.
> > > > > >
> > > > > > https://github.com/apache/pulsar/pull/21648
> > > > > >
> > > > > > I'm looking forward to hearing from you.
> > > > > >
> > > > > > BR
> > > > > > Xiangying
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Girish Sharma
> > > > >
> > > >
> > >
> >
>
>
> --
> Girish Sharma
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Girish Sharma <sc...@gmail.com>.
Hello Xiangying,

I believe what Joe here is referring to as "application design" is not the
design of pulsar or namespace level replication but the design of your
application and the dependency that you've put on topic level replication.

In general, I am aligned with Joe from an application design standpoint. A
namespace is supposed to represent a single application use case, topic
level override of replication clusters helps in cases where there are a few
exceptional topics which do not need replication in all of the namespace
clusters. This helps in saving network bandwidth, storage, CPU, RAM etc

But the reason why you've raised this PIP is to bring down the actual
replication semantics at a topic level. Yes, namespace level still exists
as per your PIP as well, but is basically left only to be a "default in
case topic level is missing".
This brings me to a very basic question - What's the use case that you are
trying to solve that needs these changes? Because, then what's stopping us
from bringing every construct that's at a namespace level (bundling,
hardware affinity, etc) down to a topic level?

Regards

On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <xi...@apache.org> wrote:

> Hi Joe,
>
> You're correct. The initial design of the replication policy leaves room
> for improvement. To address this, we aim to refine the cluster settings at
> the namespace level in a way that won't impact the existing system. The
> replication clusters should solely be used to establish full mesh
> replication for that specific namespace, without having any other
> definitions or functionalities.
>
> BR,
> Xiangying
>
>
> On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com> wrote:
>
> > >if users want to change the replication policy for
> > topic-n and do not change the replication policy of other topics, they
> need
> > to change all the topic policy under this namespace.
> >
> > This PIP unfortunately  flows from  attempting to solve bad application
> > design
> >
> > A namespace is supposed to represent an application, and the namespace
> > policy is an umbrella for a similar set of policies  that applies to all
> > topics.  The exceptions would be if a topic had a need for a deficit, The
> > case of one topic in the namespace sticking out of the namespace policy
> > umbrella is bad  application design in my opinion
> >
> > -Joe.
> >
> >
> >
> > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <xi...@apache.org>
> > wrote:
> >
> > > Hi Rajan and Girish,
> > > Thanks for your reply. About the question you mentioned, there is some
> > > information I want to share with you.
> > > >If anyone wants to setup different replication clusters then either
> > > >those topics can be created under different namespaces or defined at
> > topic
> > > >level policy.
> > >
> > > >And users can anyway go and update the namespace's cluster list to add
> > the
> > > >missing cluster.
> > > Because the replication clusters also mean the clusters where the topic
> > can
> > > be created or loaded, the topic-level replication clusters can only be
> > the
> > > subset of namespace-level replication clusters.
> > > Just as Girish mentioned, the users can update the namespace's cluster
> > list
> > > to add the missing cluster.
> > > But there is a problem because the replication clusters as the
> namespace
> > > level will create a full mesh replication for that namespace across the
> > > clusters defined in
> > > replication-clusters if users want to change the replication policy for
> > > topic-n and do not change the replication policy of other topics, they
> > need
> > > to change all the topic policy under this namespace.
> > >
> > > > Pulsar is being used by many legacy systems and changing its
> > > >semantics for specific usecases without considering consequences are
> > > >creating a lot of pain and incompatibility problems for other existing
> > > >systems and let's avoid doing it as we are struggling with such
> changes
> > > and
> > > >breaking compatibility or changing semantics are just not acceptable.
> > >
> > > This proposal will not introduce an incompatibility problem, because
> the
> > > default value of the namespace policy of allowed-clusters and
> > > topic-policy-synchronized-clusters are the replication-clusters.
> > >
> > > >Allowed clusters defined at tenant level
> > > >will restrict tenants to create namespaces in regions/clusters where
> > they
> > > >are not allowed.
> > > >As Rajan also mentioned, allowed-clusters field has a different
> > > meaning/purpose.
> > >
> > > Allowed clusters defined at the tenant level will restrict tenants from
> > > creating namespaces in regions/clusters where they are not allowed.
> > > Similarly, the allowed clusters defined at the namespace level will
> > > restrict the namespace from creating topics in regions/clusters where
> > they
> > > are not allowed.
> > > What's wrong with this?
> > >
> > > Regards,
> > > Xiangying
> > >
> > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <sc...@gmail.com>
> > > wrote:
> > >
> > > > Hi Xiangying,
> > > >
> > > > Shouldn't the solution to the issue mentioned in #21564 [0] mostly be
> > > > around validating that topic level replication clusters are subset of
> > > > namespace level replication clusters?
> > > > It would be a completely compatible change as even today the case
> > where a
> > > > topic has a cluster not defined in namespaces's replication-clusters
> > > > doesn't really work.
> > > > And users can anyway go and update the namespace's cluster list to
> add
> > > the
> > > > missing cluster.
> > > >
> > > > As Rajan also mentioned, allowed-clusters field has a different
> > > > meaning/purpose.
> > > > Regards
> > > >
> > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
> xiangying@apache.org>
> > > > wrote:
> > > >
> > > > > Hi, Pulsar Community
> > > > >
> > > > > I drafted a proposal to make the configuration of clusters at the
> > > > namespace
> > > > > level clearer. This helps solve the problem of geo-replication not
> > > > working
> > > > > correctly at the topic level.
> > > > >
> > > > > https://github.com/apache/pulsar/pull/21648
> > > > >
> > > > > I'm looking forward to hearing from you.
> > > > >
> > > > > BR
> > > > > Xiangying
> > > > >
> > > >
> > > >
> > > > --
> > > > Girish Sharma
> > > >
> > >
> >
>


-- 
Girish Sharma

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Joe,

You're correct. The initial design of the replication policy leaves room
for improvement. To address this, we aim to refine the cluster settings at
the namespace level in a way that won't impact the existing system. The
replication clusters should solely be used to establish full mesh
replication for that specific namespace, without having any other
definitions or functionalities.

BR,
Xiangying


On Mon, Dec 4, 2023 at 1:52 PM Joe F <jo...@gmail.com> wrote:

> >if users want to change the replication policy for
> topic-n and do not change the replication policy of other topics, they need
> to change all the topic policy under this namespace.
>
> This PIP unfortunately  flows from  attempting to solve bad application
> design
>
> A namespace is supposed to represent an application, and the namespace
> policy is an umbrella for a similar set of policies  that applies to all
> topics.  The exceptions would be if a topic had a need for a deficit, The
> case of one topic in the namespace sticking out of the namespace policy
> umbrella is bad  application design in my opinion
>
> -Joe.
>
>
>
> On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <xi...@apache.org>
> wrote:
>
> > Hi Rajan and Girish,
> > Thanks for your reply. About the question you mentioned, there is some
> > information I want to share with you.
> > >If anyone wants to setup different replication clusters then either
> > >those topics can be created under different namespaces or defined at
> topic
> > >level policy.
> >
> > >And users can anyway go and update the namespace's cluster list to add
> the
> > >missing cluster.
> > Because the replication clusters also mean the clusters where the topic
> can
> > be created or loaded, the topic-level replication clusters can only be
> the
> > subset of namespace-level replication clusters.
> > Just as Girish mentioned, the users can update the namespace's cluster
> list
> > to add the missing cluster.
> > But there is a problem because the replication clusters as the namespace
> > level will create a full mesh replication for that namespace across the
> > clusters defined in
> > replication-clusters if users want to change the replication policy for
> > topic-n and do not change the replication policy of other topics, they
> need
> > to change all the topic policy under this namespace.
> >
> > > Pulsar is being used by many legacy systems and changing its
> > >semantics for specific usecases without considering consequences are
> > >creating a lot of pain and incompatibility problems for other existing
> > >systems and let's avoid doing it as we are struggling with such changes
> > and
> > >breaking compatibility or changing semantics are just not acceptable.
> >
> > This proposal will not introduce an incompatibility problem, because the
> > default value of the namespace policy of allowed-clusters and
> > topic-policy-synchronized-clusters are the replication-clusters.
> >
> > >Allowed clusters defined at tenant level
> > >will restrict tenants to create namespaces in regions/clusters where
> they
> > >are not allowed.
> > >As Rajan also mentioned, allowed-clusters field has a different
> > meaning/purpose.
> >
> > Allowed clusters defined at the tenant level will restrict tenants from
> > creating namespaces in regions/clusters where they are not allowed.
> > Similarly, the allowed clusters defined at the namespace level will
> > restrict the namespace from creating topics in regions/clusters where
> they
> > are not allowed.
> > What's wrong with this?
> >
> > Regards,
> > Xiangying
> >
> > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <sc...@gmail.com>
> > wrote:
> >
> > > Hi Xiangying,
> > >
> > > Shouldn't the solution to the issue mentioned in #21564 [0] mostly be
> > > around validating that topic level replication clusters are subset of
> > > namespace level replication clusters?
> > > It would be a completely compatible change as even today the case
> where a
> > > topic has a cluster not defined in namespaces's replication-clusters
> > > doesn't really work.
> > > And users can anyway go and update the namespace's cluster list to add
> > the
> > > missing cluster.
> > >
> > > As Rajan also mentioned, allowed-clusters field has a different
> > > meaning/purpose.
> > > Regards
> > >
> > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <xi...@apache.org>
> > > wrote:
> > >
> > > > Hi, Pulsar Community
> > > >
> > > > I drafted a proposal to make the configuration of clusters at the
> > > namespace
> > > > level clearer. This helps solve the problem of geo-replication not
> > > working
> > > > correctly at the topic level.
> > > >
> > > > https://github.com/apache/pulsar/pull/21648
> > > >
> > > > I'm looking forward to hearing from you.
> > > >
> > > > BR
> > > > Xiangying
> > > >
> > >
> > >
> > > --
> > > Girish Sharma
> > >
> >
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Joe F <jo...@gmail.com>.
>if users want to change the replication policy for
topic-n and do not change the replication policy of other topics, they need
to change all the topic policy under this namespace.

This PIP unfortunately  flows from  attempting to solve bad application
design

A namespace is supposed to represent an application, and the namespace
policy is an umbrella for a similar set of policies  that applies to all
topics.  The exceptions would be if a topic had a need for a deficit, The
case of one topic in the namespace sticking out of the namespace policy
umbrella is bad  application design in my opinion

-Joe.



On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <xi...@apache.org> wrote:

> Hi Rajan and Girish,
> Thanks for your reply. About the question you mentioned, there is some
> information I want to share with you.
> >If anyone wants to setup different replication clusters then either
> >those topics can be created under different namespaces or defined at topic
> >level policy.
>
> >And users can anyway go and update the namespace's cluster list to add the
> >missing cluster.
> Because the replication clusters also mean the clusters where the topic can
> be created or loaded, the topic-level replication clusters can only be the
> subset of namespace-level replication clusters.
> Just as Girish mentioned, the users can update the namespace's cluster list
> to add the missing cluster.
> But there is a problem because the replication clusters as the namespace
> level will create a full mesh replication for that namespace across the
> clusters defined in
> replication-clusters if users want to change the replication policy for
> topic-n and do not change the replication policy of other topics, they need
> to change all the topic policy under this namespace.
>
> > Pulsar is being used by many legacy systems and changing its
> >semantics for specific usecases without considering consequences are
> >creating a lot of pain and incompatibility problems for other existing
> >systems and let's avoid doing it as we are struggling with such changes
> and
> >breaking compatibility or changing semantics are just not acceptable.
>
> This proposal will not introduce an incompatibility problem, because the
> default value of the namespace policy of allowed-clusters and
> topic-policy-synchronized-clusters are the replication-clusters.
>
> >Allowed clusters defined at tenant level
> >will restrict tenants to create namespaces in regions/clusters where they
> >are not allowed.
> >As Rajan also mentioned, allowed-clusters field has a different
> meaning/purpose.
>
> Allowed clusters defined at the tenant level will restrict tenants from
> creating namespaces in regions/clusters where they are not allowed.
> Similarly, the allowed clusters defined at the namespace level will
> restrict the namespace from creating topics in regions/clusters where they
> are not allowed.
> What's wrong with this?
>
> Regards,
> Xiangying
>
> On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <sc...@gmail.com>
> wrote:
>
> > Hi Xiangying,
> >
> > Shouldn't the solution to the issue mentioned in #21564 [0] mostly be
> > around validating that topic level replication clusters are subset of
> > namespace level replication clusters?
> > It would be a completely compatible change as even today the case where a
> > topic has a cluster not defined in namespaces's replication-clusters
> > doesn't really work.
> > And users can anyway go and update the namespace's cluster list to add
> the
> > missing cluster.
> >
> > As Rajan also mentioned, allowed-clusters field has a different
> > meaning/purpose.
> > Regards
> >
> > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <xi...@apache.org>
> > wrote:
> >
> > > Hi, Pulsar Community
> > >
> > > I drafted a proposal to make the configuration of clusters at the
> > namespace
> > > level clearer. This helps solve the problem of geo-replication not
> > working
> > > correctly at the topic level.
> > >
> > > https://github.com/apache/pulsar/pull/21648
> > >
> > > I'm looking forward to hearing from you.
> > >
> > > BR
> > > Xiangying
> > >
> >
> >
> > --
> > Girish Sharma
> >
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Xiangying Meng <xi...@apache.org>.
Hi Rajan and Girish,
Thanks for your reply. About the question you mentioned, there is some
information I want to share with you.
>If anyone wants to setup different replication clusters then either
>those topics can be created under different namespaces or defined at topic
>level policy.

>And users can anyway go and update the namespace's cluster list to add the
>missing cluster.
Because the replication clusters also mean the clusters where the topic can
be created or loaded, the topic-level replication clusters can only be the
subset of namespace-level replication clusters.
Just as Girish mentioned, the users can update the namespace's cluster list
to add the missing cluster.
But there is a problem because the replication clusters as the namespace
level will create a full mesh replication for that namespace across the
clusters defined in
replication-clusters if users want to change the replication policy for
topic-n and do not change the replication policy of other topics, they need
to change all the topic policy under this namespace.

> Pulsar is being used by many legacy systems and changing its
>semantics for specific usecases without considering consequences are
>creating a lot of pain and incompatibility problems for other existing
>systems and let's avoid doing it as we are struggling with such changes and
>breaking compatibility or changing semantics are just not acceptable.

This proposal will not introduce an incompatibility problem, because the
default value of the namespace policy of allowed-clusters and
topic-policy-synchronized-clusters are the replication-clusters.

>Allowed clusters defined at tenant level
>will restrict tenants to create namespaces in regions/clusters where they
>are not allowed.
>As Rajan also mentioned, allowed-clusters field has a different
meaning/purpose.

Allowed clusters defined at the tenant level will restrict tenants from
creating namespaces in regions/clusters where they are not allowed.
Similarly, the allowed clusters defined at the namespace level will
restrict the namespace from creating topics in regions/clusters where they
are not allowed.
What's wrong with this?

Regards,
Xiangying

On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <sc...@gmail.com>
wrote:

> Hi Xiangying,
>
> Shouldn't the solution to the issue mentioned in #21564 [0] mostly be
> around validating that topic level replication clusters are subset of
> namespace level replication clusters?
> It would be a completely compatible change as even today the case where a
> topic has a cluster not defined in namespaces's replication-clusters
> doesn't really work.
> And users can anyway go and update the namespace's cluster list to add the
> missing cluster.
>
> As Rajan also mentioned, allowed-clusters field has a different
> meaning/purpose.
> Regards
>
> On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <xi...@apache.org>
> wrote:
>
> > Hi, Pulsar Community
> >
> > I drafted a proposal to make the configuration of clusters at the
> namespace
> > level clearer. This helps solve the problem of geo-replication not
> working
> > correctly at the topic level.
> >
> > https://github.com/apache/pulsar/pull/21648
> >
> > I'm looking forward to hearing from you.
> >
> > BR
> > Xiangying
> >
>
>
> --
> Girish Sharma
>

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

Posted by Girish Sharma <sc...@gmail.com>.
Hi Xiangying,

Shouldn't the solution to the issue mentioned in #21564 [0] mostly be
around validating that topic level replication clusters are subset of
namespace level replication clusters?
It would be a completely compatible change as even today the case where a
topic has a cluster not defined in namespaces's replication-clusters
doesn't really work.
And users can anyway go and update the namespace's cluster list to add the
missing cluster.

As Rajan also mentioned, allowed-clusters field has a different
meaning/purpose.
Regards

On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <xi...@apache.org>
wrote:

> Hi, Pulsar Community
>
> I drafted a proposal to make the configuration of clusters at the namespace
> level clearer. This helps solve the problem of geo-replication not working
> correctly at the topic level.
>
> https://github.com/apache/pulsar/pull/21648
>
> I'm looking forward to hearing from you.
>
> BR
> Xiangying
>


-- 
Girish Sharma