Posted to dev@pulsar.apache.org by ma...@gmail.com on 2023/02/01 23:33:35 UTC

[DISCUSS] Topic name restriction

Hi, All

Pulsar currently doesn't support topic name restrictions, so this is a good opportunity to discuss them.

I think this discussion should identify which kinds of topic names we need to restrict.

I know of three kinds that need to be restricted at the moment.

1. The `-partition-` keyword.
2. Topic name characters validation.
3. System topic prefix `__`.
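
To make the three cases concrete, here is a minimal Java sketch that reports which of them a local topic name would hit. The class name, method name, and labels are hypothetical and not part of Pulsar; the character pattern is the one quoted later in this thread for `NamedEntity`, and treating every `__` prefix as a system topic is an assumption that the thread itself questions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not a Pulsar class.
final class TopicNameRestrictions {
    // Character pattern quoted later in this thread: letters, digits, '_', '-', '=', ':', '.'.
    private static final Pattern ALLOWED_CHARS = Pattern.compile("^[-=:.\\w]*$");
    private static final String PARTITION_KEYWORD = "-partition-";
    private static final String SYSTEM_PREFIX = "__";

    /** Returns one label per restriction that the given local topic name hits. */
    static List<String> violations(String localName) {
        List<String> hits = new ArrayList<>();
        if (localName.contains(PARTITION_KEYWORD)) {
            hits.add("partition-keyword");   // 1. the `-partition-` keyword
        }
        if (!ALLOWED_CHARS.matcher(localName).matches()) {
            hits.add("invalid-characters");  // 2. character validation
        }
        if (localName.startsWith(SYSTEM_PREFIX)) {
            hits.add("system-prefix");       // 3. the `__` system topic prefix
        }
        return hits;
    }
}
```

With this sketch, `violations("orders")` is empty, while `orders-partition-0`, `my topic`, and `__change_events` each hit exactly one of the three cases.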


Please feel free to leave your comments.
I will keep this discussion open for a week. If no new types of restrictions come up, I will refine the existing PIP-242[0] to explain more details.
> If other restrictions come up after this discussion, we can draft a new PIP to add them directly.
Thanks to Michael's suggestion[1], we can expand the scope of PIP-242 to give Pulsar good topic name restrictions.

Best,
Mattison

[0] https://github.com/apache/pulsar/issues/19239
[1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2

Re: [DISCUSS] Topic name restriction

Posted by Yong Zhang <zh...@gmail.com>.
>
> Additional:
> Since the disallowed topic names configuration needs more discussion about
> name pattern, type etc. I think we can wait for the demand to consider it.
>

Sounds good. We can discuss this in the future with a new proposal if
someone needs it.

Yong



Re: [DISCUSS] Topic name restriction

Posted by ma...@gmail.com.
Hi guys.

Thanks for your discussion in this thread. We have now reached the discussion deadline:
> I will keep this discussion for a week. If there are no more new types of restrictions, I will refine the previous PIP-242[0] to explain more details.

I would like to refine PIP-242 so that it includes four parts:

• Introduce the `enableStrictTopicName` configuration.
• Add NamedEntity validation for the topic name.
• Add the `-partition-` keyword restriction.
• Introduce a topic name structure (we can keep the original topic names and introduce a new system topic name structure, `__SYS__<name>__`).
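
As a rough illustration of how these four parts could fit together, here is a minimal Java sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the flag simply mirrors the proposed `enableStrictTopicName` setting, and the exact grammar of `__SYS__<name>__` has not been specified in this thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the PIP-242 implementation.
final class StrictTopicNameSketch {
    // Character pattern quoted in this thread for NamedEntity-style validation.
    private static final Pattern ALLOWED_CHARS = Pattern.compile("^[-=:.\\w]*$");
    // Assumed reading of the proposed `__SYS__<name>__` structure.
    private static final Pattern SYSTEM_TOPIC = Pattern.compile("^__SYS__([-=:.\\w]+?)__$");

    private final boolean enableStrictTopicName;

    StrictTopicNameSketch(boolean enableStrictTopicName) {
        this.enableStrictTopicName = enableStrictTopicName;
    }

    /** Extracts <name> from `__SYS__<name>__`, or returns empty for any other name. */
    static Optional<String> systemTopicName(String localName) {
        Matcher m = SYSTEM_TOPIC.matcher(localName);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Whether a user may create a topic with this local name. */
    boolean isAllowedForUsers(String localName) {
        if (!enableStrictTopicName) {
            return true; // flag off: keep today's permissive behavior
        }
        return ALLOWED_CHARS.matcher(localName).matches()
                && !localName.contains("-partition-")
                && !systemTopicName(localName).isPresent();
    }
}
```

Keeping the flag off preserves existing behavior, which is the compatibility concern raised earlier in the thread; original system topic names such as those starting with a bare `__` are deliberately not matched by the new structure here.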


Additional:
Since the disallowed topic names configuration needs more discussion about name pattern, type, etc., I think we can wait until there is demand for it.

Best,
Mattison


Re: [DISCUSS] Topic name restriction

Posted by ma...@gmail.com.
Hi, Asaf

> I don't understand the idea suggested of making the validation rules configurable. If I understand correctly:
> * "-partition" is not something you want configurable - it should always be validated
> * System topics - once we agree on a naming convention going forward, it should always be validated.

We need to ensure compatibility so that users can choose. As Michael mentioned, a configurable restriction makes it easy for users to avoid breaking changes:

> In the context of PIP 242, we're introducing a config to optionally enforce strict topic names. As such, we could rely on the config to either use the "cheap" check to see if the topic starts with __ or we could use the more expensive check to determine if the topic name is one of many possible system topic names. Because we want to maintain backwards compatibility, we cannot completely get rid of the old logic.

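One possible reading of the "cheap" versus "expensive" check, as a minimal Java sketch. The class name, the method names, and the listed topic names are placeholders for illustration only and are not Pulsar's real system topic list; the mapping of the strict flag to the cheap check is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not Pulsar's system topic detection.
final class SystemTopicCheckSketch {
    // Placeholder names for illustration only; NOT Pulsar's real system topics.
    private static final List<String> LEGACY_SYSTEM_TOPICS =
            Arrays.asList("example_events", "example_internal_log");

    // "Cheap" check: one prefix comparison on a self-describing name.
    static boolean cheapCheck(String localName) {
        return localName.startsWith("__");
    }

    // "Expensive" check: compare against every known system topic name.
    static boolean expensiveCheck(String localName) {
        for (String known : LEGACY_SYSTEM_TOPICS) {
            if (known.equals(localName)) {
                return true;
            }
        }
        return false;
    }

    // Assumption: with strict names enabled the prefix alone is trusted;
    // otherwise the old lookup is kept for backwards compatibility.
    static boolean isSystemTopic(String localName, boolean strictTopicNames) {
        return strictTopicNames ? cheapCheck(localName) : expensiveCheck(localName);
    }
}
```

The old lookup has to stay because existing system topics may not follow the prefix convention, and existing user topics may already start with `__`.
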
Best,
Mattison
On Feb 5, 2023, 19:24 +0800, Asaf Mesika <as...@gmail.com>, wrote:
> Thanks Mattison.
>
> I don't understand the idea suggested of making the validation rules
> configurable.
> If I understand correctly:
> * "-partition" is not something you want configurable - it should always be
> validated
> * System topics - once we agree on a naming convention going forward, it
> should always be validated.
>
> I'm ok with adding configuration for the user so they can validate rules of
> their own, maybe even per tenant.
>
>
> On Fri, Feb 3, 2023 at 11:44 AM <ma...@gmail.com> wrote:
>
> > Hi, Asaf
> >
> > We use the following regular expression to check the name:
> > "^[-=:.\\w]*$"
> > The \w means [A-Za-z0-9_], which includes underscores.
> >
> > Best,
> > Mattison
> > On Feb 2, 2023, 23:01 +0800, Asaf Mesika <as...@gmail.com>, wrote:
> > > > NamedEntity is not allowing underscores - does it make sense?
> > > >
> > > >
> > > >
> > > > On Thu, Feb 2, 2023 at 8:35 AM Michael Marshall <mm...@apache.org>
> > > > wrote:
> > > >
> > > > > > Thanks for starting this thread, Mattison.
> > > > > >
> > > > > > > > > > The topic name character validation is already done by
> > > > > > > > > > `NamedEntity#checkName`.
> > > > > >
> > > > > > Based on my reading of the code, only the tenant and the namespace
> > > > > > names are validated using that method. There is a call [0] that looks
> > > > > > like it validates topic names, but that method is only called by
> > > > > > tests.
> > > > > >
> > > > > > > > > > But I have a concern that whether we should
> > > > > > > > > > treat all topics that start with the long underscore ("__") as
> > > > > > > > > > system topics?
> > > > > >
> > > > > > This is a reasonable concern, and my primary motivation in proposing
> > > > > > this change is to make it easier for the broker to handle system
> > > > > > topics, which often get unique treatment.
> > > > > >
> > > > > > I wrote on this topic in several replies on this thread from a year ago
> > > > > > [1].
> > > > > >
> > > > > > In the context of PIP 242, we're introducing a config to optionally
> > > > > > enforce strict topic names. As such, we could rely on the config to
> > > > > > either use the "cheap" check to see if the topic starts with __ or we
> > > > > > could use the more expensive check to determine if the topic name is
> > > > > > one of many possible system topic names. Because we want to maintain
> > > > > > backwards compatibility, we cannot completely get rid of the old
> > > > > > logic. I like self describing names because they are elegant and
> > > > > > efficient.
> > > > > >
> > > > > > > > > > If yes, how would you like to allow users to access the system
> > > > > > > > > > topics?
> > > > > >
> > > > > > I proposed some ideas at the end of that thread [1]. We should have a
> > > > > > clear definition of system topics and how they are or are not accessed by
> > > > > > users. Ultimately, we continue to create new system topics without
> > > > > > reserving a designated naming structure and without defining how these
> > > > > > topics ought to be interacted with, as Yunze points out. Note that any
> > > > > > system topic we introduce could conflict with existing user topics, so
> > > > > > proactively reserving a set of names makes it easier for forwards
> > > > > > compatibility.
> > > > > >
> > > > > > Thanks,
> > > > > > Michael
> > > > > >
> > > > > > [0] https://github.com/apache/pulsar/blob/b880b1d240ade864181935aa360bfca03a5aa67f/pulsar-common/src/main/java/org/apache/pulsar/common/naming/NamespaceName.java#L159
> > > > > > [1] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17
> > > > > >
> > > > > >
> > > > > > On Wed, Feb 1, 2023 at 11:28 PM rxl@apache.org <ranxiaolong716@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Mattison:
> > > > > > > > > >
> > > > > > > > > > Agree with Yong's idea. We can expose `disallowed topic` as a configuration
> > > > > > > > > > to the user side, and a more flexible way is to expose it as a
> > > > > > > > > > namespace-level policy. This can ensure that there is no need to do special
> > > > > > > > > > processing on customized keywords in the future, and the expected effect
> > > > > > > > > > can be achieved by modifying the configuration.
> > > > > > > > > >
> > > > > > > > > > I think Yunze's concerns are justified for the system topic. Is it okay if we
> > > > > > > > > > use hard code? Because the identification of any keyword is likely to be
> > > > > > > > > > hit by the user. The hard code method is used to filter out system topics
> > > > > > > > > > and not allow users to operate during delete and create operations.
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Thanks
> > > > > > > > > > Xiaolong Ran
> > > > > > > > > >
> > > > > > > > > > Dave Fisher <wa...@comcast.net> wrote on Thu, Feb 2, 2023 at 11:26:
> > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sent from my iPhone
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Feb 1, 2023, at 6:52 PM, Yong Zhang <zhangyong1025.zy@gmail.com> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Mattison,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I agree with you about restricting the topic name.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > How about using a blacklist way to restrict it?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If we do then please call it by another name like “disallowed”.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > We can have a blacklist on the topic name restriction and make it
> > > > > > > > > > > > > > > > > > configurable. Add the keywords you mentioned in the default configuration.
> > > > > > > > > > > > > > > > > > That would have a more general way to block a topic name creation.
> > > > > > > > > > > > > > > > > > If we have more restrictions on the topic name in the future, this way
> > > > > > > > > > > > > > > > > > can make it easy to fit them without changing any code.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Is there anyone asking for this feature?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Dave
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > Yong
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com>
> > wrote:
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> Hi, All
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> In the current implementation, pulsar didn't support
> > topic name
> > > > > > > > > > > > > > > > > > > > >> restriction. It's a good chance to discuss it.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> I think this discussion aims to identify what types of
> > topic names
> > > > > > we
> > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > > > >> need to restrict.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> I know three topic names that need to be restricted at
> > the moment.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> 1. The `-partition-` keyword.
> > > > > > > > > > > > > > > > > > > > >> 2. Topic name characters validation.
> > > > > > > > > > > > > > > > > > > > >> 3. System topic prefix `__`.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> Please feel free to leave your comments.
> > > > > > > > > > > > > > > > > > > > >> I will keep this discussion for a week. If there are no
> > more new
> > > > > > types
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> restrictions, I will refine the previous PIP-242[0] to
> > explain more
> > > > > > > > > > > > > > details.
> > > > > > > > > > > > > > > > > > > > > > > >>> If we have other restrictions behind this
> > discussion. We can draft
> > > > > > a
> > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > > >> PIP to add it directly.
> > > > > > > > > > > > > > > > > > > > >> Thanks to Michael's opinion[1], we can expand the
> > PIP-242 scopes to
> > > > > > help
> > > > > > > > > > > > > > > > > > > > >> pulsar have a good topic name restriction.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> Best,
> > > > > > > > > > > > > > > > > > > > >> Mattison
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> [0] https://github.com/apache/pulsar/issues/19239
> > > > > > > > > > > > > > > > > > > > >> [1]
> > > > > > https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > >
> >

Re: [DISCUSS] Topic name restriction

Posted by Asaf Mesika <as...@gmail.com>.
Thanks Mattison.

I don't understand the suggestion to make the validation rules
configurable.
If I understand correctly:
* "-partition" is not something you want configurable - it should always be
validated
* System topics - once we agree on a naming convention going forward, it
should always be validated.

I'm OK with adding configuration so that users can define validation rules
of their own, maybe even per tenant.
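
To illustrate why the "-partition" rule should not be optional, here is a minimal sketch. It assumes the usual "<topic>-partition-<N>" naming of the individual partitions of a partitioned topic; the class and method names are made up for this example and are not Pulsar APIs.

```java
class PartitionKeywordDemo {
    // Separator used in partition names of the form "<topic>-partition-<N>".
    static final String PARTITION_KEYWORD = "-partition-";

    // Builds the name of partition `index` of a partitioned topic.
    static String partitionName(String topic, int index) {
        return topic + PARTITION_KEYWORD + index;
    }

    // Hypothetical always-on rule: reject user-supplied names that contain the keyword.
    static boolean isAllowedUserTopicName(String localName) {
        return !localName.contains(PARTITION_KEYWORD);
    }

    public static void main(String[] args) {
        String userTopic = "orders-partition-0";
        // The user-chosen name is identical to partition 0 of a partitioned topic "orders".
        System.out.println(userTopic.equals(partitionName("orders", 0))); // true
        System.out.println(isAllowedUserTopicName(userTopic));            // false
        System.out.println(isAllowedUserTopicName("orders"));             // true
    }
}
```

A user topic that happens to carry this keyword cannot be told apart from a partition of another topic, which is why I would keep this check unconditional.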


On Fri, Feb 3, 2023 at 11:44 AM <ma...@gmail.com> wrote:

> Hi, Asaf
>
> We are using the regular expression to check the name.
> "^[-=:.\\w]*$"
> The \w means [A-Za-z0-9_] , which includes underscores.
>
> Best,
> Mattison
> On Feb 2, 2023, 23:01 +0800, Asaf Mesika <as...@gmail.com>, wrote:
> > NamedEntity is not allowing underscores - does it make sense?
> >
> >
> >
> > On Thu, Feb 2, 2023 at 8:35 AM Michael Marshall <mm...@apache.org>
> > wrote:
> >
> > > Thanks for starting this thread, Mattison.
> > >
> > > > > The topic name character validation is already done by
> > > > > `NamedEntity#checkName`.
> > >
> > > Based on my reading of the code, only the tenant and the namespace
> > > names are validated using that method. There is a call [0] that looks
> > > like it validates topic names, but that method is only called by
> > > tests.
> > >
> > > > > But I have a concern that whether we should
> > > > > treat all topics that start with the long underscore ("__") as
> system
> > > > > topics?
> > >
> > > This is a reasonable concern, and my primary motivation in proposing
> > > this change is to make it easier for the broker to handle system
> > > topics, which often get unique treatment.
> > >
> > > I wrote on this topic in several replies on this thread from a year ago
> > > [1].
> > >
> > > In the context of PIP 242, we're introducing a config to optionally
> > > enforce strict topic names. As such, we could rely on the config to
> > > either use the "cheap" check to see if the topic starts with __ or we
> > > could use the more expensive check to determine if the topic name is
> > > one of many possible system topic names. Because we want to maintain
> > > backwards compatibility, we cannot completely get rid of the old
> > > logic. I like self describing names because they are elegant and
> > > efficient.
> > >
> > > > > If yes, how would you like to allow users to access the system
> topics?
> > >
> > > I proposed some ideas at the end of that thread [1]. We should have a
> > > clear definition of system topics and how they are or are not accessed
> by
> > > users. Ultimately, we continue to create new system topics without
> > > reserving a designated naming structure and without defining how these
> > > topics ought to be interacted with, as Yunze points out. Note that any
> > > system topic we introduce could conflict with existing user topics, so
> > > proactively reserving a set of names makes it easier for forwards
> > > compatibility.
> > >
> > > Thanks,
> > > Michael
> > >
> > > [0]
> > >
> https://github.com/apache/pulsar/blob/b880b1d240ade864181935aa360bfca03a5aa67f/pulsar-common/src/main/java/org/apache/pulsar/common/naming/NamespaceName.java#L159
> > > [1] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17
> > >
> > >
> > > On Wed, Feb 1, 2023 at 11:28 PM rxl@apache.org <
> ranxiaolong716@gmail.com>
> > > wrote:
> > > > >
> > > > > Hi Mattison:
> > > > >
> > > > > Agree with Yong's idea. We can expose `disallowed topic` as a
> > > configuration
> > > > > to the user side, and a more flexible way is to expose it as a
> > > > > namespace-level policy. This can ensure that there is no need to do
> > > special
> > > > > processing on customized keywords in the future, and the expected
> effect
> > > > > can be achieved by modifying the configuration.
> > > > >
> > > > > Think Yunze's concerns are justified for the system topic. Is it
> okay if
> > > we
> > > > > use hard code? Because the identification of any keyword is likely
> to be
> > > > > hit by the user. The hard code method is used to filter out system
> topics
> > > > > and not allow users to operate during delete and create operations.
> > > > >
> > > > > --
> > > > > Thanks
> > > > > Xiaolong Ran
> > > > >
> > > > > > On Thu, Feb 2, 2023 at 11:26, Dave Fisher <wa...@comcast.net> wrote:
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > Sent from my iPhone
> > > > > > >
> > > > > > > > > On Feb 1, 2023, at 6:52 PM, Yong Zhang <
> zhangyong1025.zy@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Mattison,
> > > > > > > > >
> > > > > > > > > I agree with you about restricting the topic name.
> > > > > > > > >
> > > > > > > > > How about using a blacklist way to restrict it?
> > > > > > >
> > > > > > > If we do then please call it by another name like “disallowed”.
> > > > > > >
> > > > > > > > >
> > > > > > > > > We can have a blacklist on the topic name restriction and
> make it
> > > > > > > > > configurable. Add the keywords you mentioned in the default
> > > > > > > configuration.
> > > > > > > > > That would have a more general way to block a topic name
> creation.
> > > > > > > > > If we have more restrictions on the topic name in the
> future, this
> > > way
> > > > > > > > > can make it easy to fit them without changing any code.
> > > > > > >
> > > > > > > Is there anyone asking for this feature?
> > > > > > >
> > > > > > > Best,
> > > > > > > Dave
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Yong
> > > > > > > > >
> > > > > > > > > >> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com>
> wrote:
> > > > > > > > > >>
> > > > > > > > > >> Hi, All
> > > > > > > > > >>
> > > > > > > > > >> In the current implementation, pulsar didn't support
> topic name
> > > > > > > > > >> restriction. It's a good chance to discuss it.
> > > > > > > > > >>
> > > > > > > > > >> I think this discussion aims to identify what types of
> topic names
> > > we
> > > > > > > all
> > > > > > > > > >> need to restrict.
> > > > > > > > > >>
> > > > > > > > > >> I know three topic names that need to be restricted at
> the moment.
> > > > > > > > > >>
> > > > > > > > > >> 1. The `-partition-` keyword.
> > > > > > > > > >> 2. Topic name characters validation.
> > > > > > > > > >> 3. System topic prefix `__`.
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> Please feel free to leave your comments.
> > > > > > > > > >> I will keep this discussion for a week. If there are no
> more new
> > > types
> > > > > > > of
> > > > > > > > > >> restrictions, I will refine the previous PIP-242[0] to
> explain more
> > > > > > > details.
> > > > > > > > > > >>> If we have other restrictions behind this
> discussion. We can draft
> > > a
> > > > > > > new
> > > > > > > > > >> PIP to add it directly.
> > > > > > > > > >> Thanks to Michael's opinion[1], we can expand the
> PIP-242 scopes to
> > > help
> > > > > > > > > >> pulsar have a good topic name restriction.
> > > > > > > > > >>
> > > > > > > > > >> Best,
> > > > > > > > > >> Mattison
> > > > > > > > > >>
> > > > > > > > > >> [0] https://github.com/apache/pulsar/issues/19239
> > > > > > > > > >> [1]
> > > https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> > > > > > > > > >>
> > > > > > >
> > > > > > >
> > >
>

Re: [DISCUSS] Topic name restriction

Posted by ma...@gmail.com.
Hi, Asaf

We use the following regular expression to check the name:
"^[-=:.\\w]*$"
Here \w means [A-Za-z0-9_], which includes underscores.
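
A quick way to confirm this is the small sketch below. It only reuses the pattern quoted above; the class and method names are made up for the example and this is not a copy of the Pulsar source.

```java
import java.util.regex.Pattern;

class NamePatternDemo {
    // The pattern quoted above. In Java source the backslash is doubled,
    // so the regex engine sees \w, i.e. [A-Za-z0-9_] by default.
    static final Pattern NAME_PATTERN = Pattern.compile("^[-=:.\\w]*$");

    // True when the whole name consists only of the allowed characters.
    static boolean isValid(String name) {
        return NAME_PATTERN.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("my_topic"));   // true: underscore is part of \w
        System.out.println(isValid("a-b=c:d.e"));  // true: - = : . are listed explicitly
        System.out.println(isValid("my topic"));   // false: space is not allowed
        System.out.println(isValid("my/topic"));   // false: slash is not allowed
    }
}
```

Note that the `*` quantifier also lets the empty string pass, so a separate non-empty check would be needed if empty names should be rejected.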

Best,
Mattison
On Feb 2, 2023, 23:01 +0800, Asaf Mesika <as...@gmail.com>, wrote:
> NamedEntity is not allowing underscores - does it make sense?
>
>
>
> On Thu, Feb 2, 2023 at 8:35 AM Michael Marshall <mm...@apache.org>
> wrote:
>
> > Thanks for starting this thread, Mattison.
> >
> > > > The topic name character validation is already done by
> > > > `NamedEntity#checkName`.
> >
> > Based on my reading of the code, only the tenant and the namespace
> > names are validated using that method. There is a call [0] that looks
> > like it validates topic names, but that method is only called by
> > tests.
> >
> > > > But I have a concern that whether we should
> > > > treat all topics that start with the long underscore ("__") as system
> > > > topics?
> >
> > This is a reasonable concern, and my primary motivation in proposing
> > this change is to make it easier for the broker to handle system
> > topics, which often get unique treatment.
> >
> > I wrote on this topic in several replies on this thread from a year ago
> > [1].
> >
> > In the context of PIP 242, we're introducing a config to optionally
> > enforce strict topic names. As such, we could rely on the config to
> > either use the "cheap" check to see if the topic starts with __ or we
> > could use the more expensive check to determine if the topic name is
> > one of many possible system topic names. Because we want to maintain
> > backwards compatibility, we cannot completely get rid of the old
> > logic. I like self describing names because they are elegant and
> > efficient.
> >
> > > > If yes, how would you like to allow users to access the system topics?
> >
> > I proposed some ideas at the end of that thread [1]. We should have a
> > clear definition of system topics and how they are or are not accessed by
> > users. Ultimately, we continue to create new system topics without
> > reserving a designated naming structure and without defining how these
> > topics ought to be interacted with, as Yunze points out. Note that any
> > system topic we introduce could conflict with existing user topics, so
> > proactively reserving a set of names makes it easier for forwards
> > compatibility.
> >
> > Thanks,
> > Michael
> >
> > [0]
> > https://github.com/apache/pulsar/blob/b880b1d240ade864181935aa360bfca03a5aa67f/pulsar-common/src/main/java/org/apache/pulsar/common/naming/NamespaceName.java#L159
> > [1] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17
> >
> >
> > On Wed, Feb 1, 2023 at 11:28 PM rxl@apache.org <ra...@gmail.com>
> > wrote:
> > > >
> > > > Hi Mattison:
> > > >
> > > > Agree with Yong's idea. We can expose `disallowed topic` as a
> > configuration
> > > > to the user side, and a more flexible way is to expose it as a
> > > > namespace-level policy. This can ensure that there is no need to do
> > special
> > > > processing on customized keywords in the future, and the expected effect
> > > > can be achieved by modifying the configuration.
> > > >
> > > > Think Yunze's concerns are justified for the system topic. Is it okay if
> > we
> > > > use hard code? Because the identification of any keyword is likely to be
> > > > hit by the user. The hard code method is used to filter out system topics
> > > > and not allow users to operate during delete and create operations.
> > > >
> > > > --
> > > > Thanks
> > > > Xiaolong Ran
> > > >
> > > > On Thu, Feb 2, 2023 at 11:26, Dave Fisher <wa...@comcast.net> wrote:
> > > >
> > > > > >
> > > > > >
> > > > > > Sent from my iPhone
> > > > > >
> > > > > > > > On Feb 1, 2023, at 6:52 PM, Yong Zhang <zh...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > Mattison,
> > > > > > > >
> > > > > > > > I agree with you about restricting the topic name.
> > > > > > > >
> > > > > > > > How about using a blacklist way to restrict it?
> > > > > >
> > > > > > If we do then please call it by another name like “disallowed”.
> > > > > >
> > > > > > > >
> > > > > > > > We can have a blacklist on the topic name restriction and make it
> > > > > > > > configurable. Add the keywords you mentioned in the default
> > > > > > configuration.
> > > > > > > > That would have a more general way to block a topic name creation.
> > > > > > > > If we have more restrictions on the topic name in the future, this
> > way
> > > > > > > > can make it easy to fit them without changing any code.
> > > > > >
> > > > > > Is there anyone asking for this feature?
> > > > > >
> > > > > > Best,
> > > > > > Dave
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Yong
> > > > > > > >
> > > > > > > > >> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
> > > > > > > > >>
> > > > > > > > >> Hi, All
> > > > > > > > >>
> > > > > > > > >> In the current implementation, pulsar didn't support topic name
> > > > > > > > >> restriction. It's a good chance to discuss it.
> > > > > > > > >>
> > > > > > > > >> I think this discussion aims to identify what types of topic names
> > we
> > > > > > all
> > > > > > > > >> need to restrict.
> > > > > > > > >>
> > > > > > > > >> I know three topic names that need to be restricted at the moment.
> > > > > > > > >>
> > > > > > > > >> 1. The `-partition-` keyword.
> > > > > > > > >> 2. Topic name characters validation.
> > > > > > > > >> 3. System topic prefix `__`.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> Please feel free to leave your comments.
> > > > > > > > >> I will keep this discussion for a week. If there are no more new
> > types
> > > > > > of
> > > > > > > > >> restrictions, I will refine the previous PIP-242[0] to explain more
> > > > > > details.
> > > > > > > > > >>> If we have other restrictions behind this discussion. We can draft
> > a
> > > > > > new
> > > > > > > > >> PIP to add it directly.
> > > > > > > > >> Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to
> > help
> > > > > > > > >> pulsar have a good topic name restriction.
> > > > > > > > >>
> > > > > > > > >> Best,
> > > > > > > > >> Mattison
> > > > > > > > >>
> > > > > > > > >> [0] https://github.com/apache/pulsar/issues/19239
> > > > > > > > >> [1]
> > https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> > > > > > > > >>
> > > > > >
> > > > > >
> >

Re: [DISCUSS] Topic name restriction

Posted by Asaf Mesika <as...@gmail.com>.
NamedEntity does not allow underscores. Does that make sense?



On Thu, Feb 2, 2023 at 8:35 AM Michael Marshall <mm...@apache.org>
wrote:

> Thanks for starting this thread, Mattison.
>
> > The topic name character validation is already done by
> > `NamedEntity#checkName`.
>
> Based on my reading of the code, only the tenant and the namespace
> names are validated using that method. There is a call [0] that looks
> like it validates topic names, but that method is only called by
> tests.
>
> > But I have a concern that whether we should
> > treat all topics that start with the long underscore ("__") as system
> > topics?
>
> This is a reasonable concern, and my primary motivation in proposing
> this change is to make it easier for the broker to handle system
> topics, which often get unique treatment.
>
> I wrote on this topic in several replies on this thread from a year ago
> [1].
>
> In the context of PIP 242, we're introducing a config to optionally
> enforce strict topic names. As such, we could rely on the config to
> either use the "cheap" check to see if the topic starts with __ or we
> could use the more expensive check to determine if the topic name is
> one of many possible system topic names. Because we want to maintain
> backwards compatibility, we cannot completely get rid of the old
> logic. I like self describing names because they are elegant and
> efficient.
>
> > If yes, how would you like to allow users to access the system topics?
>
> I proposed some ideas at the end of that thread [1]. We should have a
> clear definition of system topics and how they are or are not accessed by
> users. Ultimately, we continue to create new system topics without
> reserving a designated naming structure and without defining how these
> topics ought to be interacted with, as Yunze points out. Note that any
> system topic we introduce could conflict with existing user topics, so
> proactively reserving a set of names makes it easier for forwards
> compatibility.
>
> Thanks,
> Michael
>
> [0]
> https://github.com/apache/pulsar/blob/b880b1d240ade864181935aa360bfca03a5aa67f/pulsar-common/src/main/java/org/apache/pulsar/common/naming/NamespaceName.java#L159
> [1] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17
>
>
> On Wed, Feb 1, 2023 at 11:28 PM rxl@apache.org <ra...@gmail.com>
> wrote:
> >
> > Hi Mattison:
> >
> > Agree with Yong's idea. We can expose `disallowed topic` as a
> configuration
> > to the user side, and a more flexible way is to expose it as a
> > namespace-level policy. This can ensure that there is no need to do
> special
> > processing on customized keywords in the future, and the expected effect
> > can be achieved by modifying the configuration.
> >
> > Think Yunze's concerns are justified for the system topic. Is it okay if
> we
> > use hard code? Because the identification of any keyword is likely to be
> > hit by the user. The hard code method is used to filter out system topics
> > and not allow users to operate during delete and create operations.
> >
> > --
> > Thanks
> > Xiaolong Ran
> >
> > On Thu, Feb 2, 2023 at 11:26, Dave Fisher <wa...@comcast.net> wrote:
> >
> > >
> > >
> > > Sent from my iPhone
> > >
> > > > On Feb 1, 2023, at 6:52 PM, Yong Zhang <zh...@gmail.com>
> > > wrote:
> > > >
> > > > Mattison,
> > > >
> > > > I agree with you about restricting the topic name.
> > > >
> > > > How about using a blacklist way to restrict it?
> > >
> > > If we do then please call it by another name like “disallowed”.
> > >
> > > >
> > > > We can have a blacklist on the topic name restriction and make it
> > > > configurable. Add the keywords you mentioned in the default
> > > configuration.
> > > > That would have a more general way to block a topic name creation.
> > > > If we have more restrictions on the topic name in the future, this
> way
> > > > can make it easy to fit them without changing any code.
> > >
> > > Is there anyone asking for this feature?
> > >
> > > Best,
> > > Dave
> > > >
> > > > Thanks,
> > > > Yong
> > > >
> > > >> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
> > > >>
> > > >> Hi, All
> > > >>
> > > >> In the current implementation, pulsar didn't support topic name
> > > >> restriction. It's a good chance to discuss it.
> > > >>
> > > >> I think this discussion aims to identify what types of topic names
> we
> > > all
> > > >> need to restrict.
> > > >>
> > > >> I know three topic names that need to be restricted at the moment.
> > > >>
> > > >> 1. The `-partition-` keyword.
> > > >> 2. Topic name characters validation.
> > > >> 3. System topic prefix `__`.
> > > >>
> > > >>
> > > >> Please feel free to leave your comments.
> > > >> I will keep this discussion for a week. If there are no more new
> types
> > > of
> > > >> restrictions, I will refine the previous PIP-242[0] to explain more
> > > details.
> > > >>> If we have other restrictions behind this discussion. We can draft
> a
> > > new
> > > >> PIP to add it directly.
> > > >> Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to
> help
> > > >> pulsar have a good topic name restriction.
> > > >>
> > > >> Best,
> > > >> Mattison
> > > >>
> > > >> [0] https://github.com/apache/pulsar/issues/19239
> > > >> [1]
> https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> > > >>
> > >
> > >
>

Re: [DISCUSS] Topic name restriction

Posted by Michael Marshall <mm...@apache.org>.
Thanks for starting this thread, Mattison.

> The topic name character validation is already done by
> `NamedEntity#checkName`.

Based on my reading of the code, only the tenant and the namespace
names are validated using that method. There is a call [0] that looks
like it validates topic names, but that method is only called by
tests.

> But I have a concern that whether we should
> treat all topics that start with the long underscore ("__") as system
> topics?

This is a reasonable concern, and my primary motivation in proposing
this change is to make it easier for the broker to handle system
topics, which often get unique treatment.

I wrote about this in several replies on a thread from a year ago [1].

In the context of PIP-242, we're introducing a config to optionally
enforce strict topic names. As such, we could rely on the config to
either use the "cheap" check to see if the topic starts with __ or we
could use the more expensive check to determine if the topic name is
one of many possible system topic names. Because we want to maintain
backwards compatibility, we cannot completely get rid of the old
logic. I like self-describing names because they are elegant and
efficient.
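
As a rough sketch of what I mean (the topic names, the class, and the flag parameter below are hypothetical placeholders, not existing broker code):

```java
import java.util.Set;

class SystemTopicCheckDemo {
    // Placeholder names standing in for "one of many possible system topic names".
    // The second entry shows why a prefix check alone cannot cover existing names.
    static final Set<String> KNOWN_SYSTEM_TOPICS =
            Set.of("__example_events", "example_legacy_system_topic");

    // "Cheap" check: the name describes itself through its prefix.
    static boolean hasSystemPrefix(String localName) {
        return localName.startsWith("__");
    }

    // Old-style check: look the name up among the known system topic names.
    static boolean isKnownSystemTopic(String localName) {
        return KNOWN_SYSTEM_TOPICS.contains(localName);
    }

    // The strict-name config decides which check applies.
    static boolean isSystemTopic(String localName, boolean strictTopicNames) {
        return strictTopicNames
                ? hasSystemPrefix(localName)
                : isKnownSystemTopic(localName);
    }

    public static void main(String[] args) {
        System.out.println(isSystemTopic("__anything", true));                   // true
        System.out.println(isSystemTopic("__anything", false));                  // false
        System.out.println(isSystemTopic("example_legacy_system_topic", false)); // true
    }
}
```

In this toy version the lookup is a single set membership test, so it does not show the real cost difference. It only shows that both code paths have to stay for as long as strict names are optional.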

> If yes, how would you like to allow users to access the system topics?

I proposed some ideas at the end of that thread [1]. We should have a
clear definition of system topics and how they are or are not accessed by
users. Ultimately, we continue to create new system topics without
reserving a designated naming structure and without defining how these
topics ought to be interacted with, as Yunze points out. Note that any
system topic we introduce could conflict with existing user topics, so
proactively reserving a set of names makes forward compatibility
easier.

Thanks,
Michael

[0] https://github.com/apache/pulsar/blob/b880b1d240ade864181935aa360bfca03a5aa67f/pulsar-common/src/main/java/org/apache/pulsar/common/naming/NamespaceName.java#L159
[1] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17


On Wed, Feb 1, 2023 at 11:28 PM rxl@apache.org <ra...@gmail.com> wrote:
>
> Hi Mattison:
>
> Agree with Yong's idea. We can expose `disallowed topic` as a configuration
> to the user side, and a more flexible way is to expose it as a
> namespace-level policy. This can ensure that there is no need to do special
> processing on customized keywords in the future, and the expected effect
> can be achieved by modifying the configuration.
>
> Think Yunze's concerns are justified for the system topic. Is it okay if we
> use hard code? Because the identification of any keyword is likely to be
> hit by the user. The hard code method is used to filter out system topics
> and not allow users to operate during delete and create operations.
>
> --
> Thanks
> Xiaolong Ran
>
> On Thu, Feb 2, 2023 at 11:26, Dave Fisher <wa...@comcast.net> wrote:
>
> >
> >
> > Sent from my iPhone
> >
> > > On Feb 1, 2023, at 6:52 PM, Yong Zhang <zh...@gmail.com>
> > wrote:
> > >
> > > Mattison,
> > >
> > > I agree with you about restricting the topic name.
> > >
> > > How about using a blacklist way to restrict it?
> >
> > If we do then please call it by another name like “disallowed”.
> >
> > >
> > > We can have a blacklist on the topic name restriction and make it
> > > configurable. Add the keywords you mentioned in the default
> > configuration.
> > > That would have a more general way to block a topic name creation.
> > > If we have more restrictions on the topic name in the future, this way
> > > can make it easy to fit them without changing any code.
> >
> > Is there anyone asking for this feature?
> >
> > Best,
> > Dave
> > >
> > > Thanks,
> > > Yong
> > >
> > >> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
> > >>
> > >> Hi, All
> > >>
> > >> In the current implementation, pulsar didn't support topic name
> > >> restriction. It's a good chance to discuss it.
> > >>
> > >> I think this discussion aims to identify what types of topic names we
> > all
> > >> need to restrict.
> > >>
> > >> I know three topic names that need to be restricted at the moment.
> > >>
> > >> 1. The `-partition-` keyword.
> > >> 2. Topic name characters validation.
> > >> 3. System topic prefix `__`.
> > >>
> > >>
> > >> Please feel free to leave your comments.
> > >> I will keep this discussion for a week. If there are no more new types
> > of
> > >> restrictions, I will refine the previous PIP-242[0] to explain more
> > details.
> > >>> If we have other restrictions behind this discussion. We can draft a
> > new
> > >> PIP to add it directly.
> > >> Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help
> > >> pulsar have a good topic name restriction.
> > >>
> > >> Best,
> > >> Mattison
> > >>
> > >> [0] https://github.com/apache/pulsar/issues/19239
> > >> [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> > >>
> >
> >

Re: [DISCUSS] Topic name restriction

Posted by "rxl@apache.org" <ra...@gmail.com>.
Hi Mattison:

I agree with Yong's idea. We can expose `disallowed topic` as a user-facing
configuration, and a more flexible option is to expose it as a
namespace-level policy. That way, customized keywords need no special
handling in the future; the desired behavior can be achieved by changing
the configuration.

I think Yunze's concerns about system topics are justified. Would it be
okay to hard-code them? Any keyword we pick to identify system topics is
likely to collide with names that users already have. A hard-coded list
would be used to filter out system topics and to prevent users from
creating or deleting them.

--
Thanks
Xiaolong Ran

On Thu, Feb 2, 2023 at 11:26, Dave Fisher <wa...@comcast.net> wrote:

>
>
> Sent from my iPhone
>
> > On Feb 1, 2023, at 6:52 PM, Yong Zhang <zh...@gmail.com>
> wrote:
> >
> > Mattison,
> >
> > I agree with you about restricting the topic name.
> >
> > How about using a blacklist way to restrict it?
>
> If we do then please call it by another name like “disallowed”.
>
> >
> > We can have a blacklist on the topic name restriction and make it
> > configurable. Add the keywords you mentioned in the default
> configuration.
> > That would have a more general way to block a topic name creation.
> > If we have more restrictions on the topic name in the future, this way
> > can make it easy to fit them without changing any code.
>
> Is there anyone asking for this feature?
>
> Best,
> Dave
> >
> > Thanks,
> > Yong
> >
> >> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
> >>
> >> Hi, All
> >>
> >> In the current implementation, pulsar didn't support topic name
> >> restriction. It's a good chance to discuss it.
> >>
> >> I think this discussion aims to identify what types of topic names we
> all
> >> need to restrict.
> >>
> >> I know three topic names that need to be restricted at the moment.
> >>
> >> 1. The `-partition-` keyword.
> >> 2. Topic name characters validation.
> >> 3. System topic prefix `__`.
> >>
> >>
> >> Please feel free to leave your comments.
> >> I will keep this discussion for a week. If there are no more new types
> of
> >> restrictions, I will refine the previous PIP-242[0] to explain more
> details.
> >>> If we have other restrictions behind this discussion. We can draft a
> new
> >> PIP to add it directly.
> >> Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help
> >> pulsar have a good topic name restriction.
> >>
> >> Best,
> >> Mattison
> >>
> >> [0] https://github.com/apache/pulsar/issues/19239
> >> [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> >>
>
>

Re: [DISCUSS] Topic name restriction

Posted by Dave Fisher <wa...@comcast.net>.

> On Feb 1, 2023, at 6:52 PM, Yong Zhang <zh...@gmail.com> wrote:
> 
> Mattison,
> 
> I agree with you about restricting the topic name.
> 
> How about using a blacklist way to restrict it?

If we do then please call it by another name like “disallowed”.

> 
> We can have a blacklist on the topic name restriction and make it
> configurable. Add the keywords you mentioned in the default configuration.
> That would have a more general way to block a topic name creation.
> If we have more restrictions on the topic name in the future, this way
> can make it easy to fit them without changing any code.

Is there anyone asking for this feature?

Best,
Dave
> 
> Thanks,
> Yong
> 
>> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
>> 
>> Hi, All
>> 
>> In the current implementation, pulsar didn't support topic name
>> restriction. It's a good chance to discuss it.
>> 
>> I think this discussion aims to identify what types of topic names we all
>> need to restrict.
>> 
>> I know three topic names that need to be restricted at the moment.
>> 
>> 1. The `-partition-` keyword.
>> 2. Topic name characters validation.
>> 3. System topic prefix `__`.
>> 
>> 
>> Please feel free to leave your comments.
>> I will keep this discussion for a week. If there are no more new types of
>> restrictions, I will refine the previous PIP-242[0] to explain more details.
>>> If we have other restrictions behind this discussion. We can draft a new
>> PIP to add it directly.
>> Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help
>> pulsar have a good topic name restriction.
>> 
>> Best,
>> Mattison
>> 
>> [0] https://github.com/apache/pulsar/issues/19239
>> [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
>> 


Re: [DISCUSS] Topic name restriction

Posted by ma...@gmail.com.
Hi, Yong

> How about using a blacklist way to restrict it?

It's a great idea. Maybe we can define rules, such as a `keyword` rule, using regular expressions.
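
Something along these lines, as a rough sketch only. The class name and the two example rules are assumptions for illustration; the real rule format, defaults, and config name would still need to be agreed on.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class DisallowedTopicRulesDemo {
    private final List<Pattern> disallowed;

    // Each rule is a regular expression; it would come from configuration.
    DisallowedTopicRulesDemo(List<String> regexes) {
        this.disallowed = regexes.stream()
                .map(Pattern::compile)
                .collect(Collectors.toList());
    }

    // A name is allowed when no disallowed rule matches any part of it.
    boolean isAllowed(String localName) {
        for (Pattern rule : disallowed) {
            if (rule.matcher(localName).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Example rules: the `-partition-` keyword anywhere, and the `__` prefix.
        DisallowedTopicRulesDemo rules =
                new DisallowedTopicRulesDemo(List.of("-partition-", "^__"));
        System.out.println(rules.isAllowed("orders"));             // true
        System.out.println(rules.isAllowed("orders-partition-0")); // false
        System.out.println(rules.isAllowed("__internal"));         // false
        System.out.println(rules.isAllowed("my__topic"));          // true: `__` is not at the start
    }
}
```

With `find()` a plain keyword works as a "contains" rule, and anchors such as `^` turn it into a prefix rule, so one rule type can cover both cases.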

Best,
Mattison


On Feb 2, 2023, 10:52 +0800, Yong Zhang <zh...@gmail.com>, wrote:
> Mattison,
>
> I agree with you about restricting the topic name.
>
> How about using a blacklist way to restrict it?
>
> We can have a blacklist on the topic name restriction and make it
> configurable. Add the keywords you mentioned in the default configuration.
> That would have a more general way to block a topic name creation.
> If we have more restrictions on the topic name in the future, this way
> can make it easy to fit them without changing any code.
>
> Thanks,
> Yong
>
> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
>
> > Hi, All
> >
> > In the current implementation, pulsar didn't support topic name
> > restriction. It's a good chance to discuss it.
> >
> > I think this discussion aims to identify what types of topic names we all
> > need to restrict.
> >
> > I know three topic names that need to be restricted at the moment.
> >
> > 1. The `-partition-` keyword.
> > 2. Topic name characters validation.
> > 3. System topic prefix `__`.
> >
> >
> > Please feel free to leave your comments.
> > I will keep this discussion for a week. If there are no more new types of
> > restrictions, I will refine the previous PIP-242[0] to explain more details.
> > > If we have other restrictions behind this discussion. We can draft a new
> > PIP to add it directly.
> > Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help
> > pulsar have a good topic name restriction.
> >
> > Best,
> > Mattison
> >
> > [0] https://github.com/apache/pulsar/issues/19239
> > [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> >

Re: [DISCUSS] Topic name restriction

Posted by ma...@gmail.com.
Hi, Yunze

> The topic name character validation is already done by
> `NamedEntity#checkName`

As Michael mentioned, `NamedEntity#checkName` only checks the tenant and namespace.

> But I have a concern that whether we should
> treat all topics that start with the long underscore ("__") as system
> topics? Users might have defined their own "system topics" that start
> with the long underscore for special uses.
> If yes, how would you like to allow users to access the system topics?
> In Kafka, there are a few (only two, IIRC) special topics that only
> allow non-admin users to read, which means users cannot write to these
> topics or delete them.

We can log a warning when a user creates a topic whose name starts with `__`, telling them that this prefix is reserved as the system topic keyword and will probably be banned in the future.
Alternatively, we can add only the existing system topic names as keywords and introduce a new, more complex system topic rule.
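
A rough sketch of the first option (warn now, ban later) follows. The class and method names are hypothetical, and `java.util.logging` is used only to keep the example self-contained; it is not how the broker logs today.

```java
import java.util.logging.Logger;

// Hypothetical helper, not part of Pulsar: warns about, but does not reject,
// topic names that start with the reserved system topic prefix.
final class SystemTopicPrefixCheck {
    private static final Logger LOG =
            Logger.getLogger(SystemTopicPrefixCheck.class.getName());

    static final String RESERVED_PREFIX = "__";

    // Returns true if the local topic name starts with the reserved prefix.
    // The caller can still create the topic; only a warning is logged.
    static boolean warnIfReservedPrefix(String localName) {
        boolean reserved = localName.startsWith(RESERVED_PREFIX);
        if (reserved) {
            LOG.warning("Topic '" + localName + "' starts with '" + RESERVED_PREFIX
                    + "', which is reserved for system topics and may be"
                    + " disallowed in a future release.");
        }
        return reserved;
    }
}
```

Returning the boolean lets the same check be turned into a hard rejection later without changing the callers.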

Best,
Mattison
On Feb 2, 2023, 11:11 +0800, Yunze Xu <yz...@streamnative.io.invalid>, wrote:
> The topic name character validation is already done by
> `NamedEntity#checkName`. And I agree that the system topic should be
> taken carefully as well. But I have a concern that whether we should
> treat all topics that start with the long underscore ("__") as system
> topics? Users might have defined their own "system topics" that start
> with the long underscore for special uses.
>
> If yes, how would you like to allow users to access the system topics?
> In Kafka, there are a few (only two, IIRC) special topics that only
> allow non-admin users to read, which means users cannot write to these
> topics or delete them.
>
> Thanks,
> Yunze
>
> On Thu, Feb 2, 2023 at 10:52 AM Yong Zhang <zh...@gmail.com> wrote:
> >
> > Mattison,
> >
> > I agree with you about restricting the topic name.
> >
> > How about using a blacklist way to restrict it?
> >
> > We can have a blacklist on the topic name restriction and make it
> > configurable. Add the keywords you mentioned in the default configuration.
> > That would have a more general way to block a topic name creation.
> > If we have more restrictions on the topic name in the future, this way
> > can make it easy to fit them without changing any code.
> >
> > Thanks,
> > Yong
> >
> > On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
> >
> > > > Hi, All
> > > >
> > > > In the current implementation, pulsar didn't support topic name
> > > > restriction. It's a good chance to discuss it.
> > > >
> > > > I think this discussion aims to identify what types of topic names we all
> > > > need to restrict.
> > > >
> > > > I know three topic names that need to be restricted at the moment.
> > > >
> > > > 1. The `-partition-` keyword.
> > > > 2. Topic name characters validation.
> > > > 3. System topic prefix `__`.
> > > >
> > > >
> > > > Please feel free to leave your comments.
> > > > I will keep this discussion for a week. If there are no more new types of
> > > > restrictions, I will refine the previous PIP-242[0] to explain more details.
> > > > > If we have other restrictions behind this discussion. We can draft a new
> > > > PIP to add it directly.
> > > > Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help
> > > > pulsar have a good topic name restriction.
> > > >
> > > > Best,
> > > > Mattison
> > > >
> > > > [0] https://github.com/apache/pulsar/issues/19239
> > > > [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> > > >

Re: [DISCUSS] Topic name restriction

Posted by Yunze Xu <yz...@streamnative.io.INVALID>.
The topic name character validation is already done by
`NamedEntity#checkName`. I agree that system topics should be handled
carefully as well. But I have a concern: should we treat all topics
that start with a double underscore ("__") as system topics? Users
might have defined their own "system topics" that start with a double
underscore for special uses.

If yes, how would you like to allow users to access the system topics?
In Kafka, there are a few (only two, IIRC) special topics that
non-admin users may only read, which means users cannot write to these
topics or delete them.
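
For readers unfamiliar with this kind of check, character validation of a name can be sketched as below. This is an illustration only: the class name is hypothetical, and the allowed character set (letters, digits, underscore, and `-`, `=`, `:`, `.`) is an assumption made for the example, not a statement of what `NamedEntity#checkName` accepts.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pulsar: validates the characters of a
// local topic name against an assumed allow-list.
final class TopicNameCharacters {
    // Assumed allow-list: word characters (letters, digits, underscore)
    // plus '-', '=', ':' and '.'. At least one character is required.
    private static final Pattern ALLOWED = Pattern.compile("^[-=:.\\w]+$");

    // Throws IllegalArgumentException if the name is null, empty, or
    // contains a character outside the assumed allow-list.
    static void check(String localName) {
        if (localName == null || !ALLOWED.matcher(localName).matches()) {
            throw new IllegalArgumentException("Invalid topic name: " + localName);
        }
    }
}
```

Under this assumed rule, `orders.v1` passes, while names containing a space or a `/` are rejected.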

Thanks,
Yunze

On Thu, Feb 2, 2023 at 10:52 AM Yong Zhang <zh...@gmail.com> wrote:
>
> Mattison,
>
> I agree with you about restricting the topic name.
>
> How about using a blacklist way to restrict it?
>
> We can have a blacklist on the topic name restriction and make it
> configurable. Add the keywords you mentioned in the default configuration.
> That would have a more general way to block a topic name creation.
> If we have more restrictions on the topic name in the future, this way
> can make it easy to fit them without changing any code.
>
> Thanks,
> Yong
>
> On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:
>
> > Hi, All
> >
> > In the current implementation, pulsar didn't support topic name
> > restriction. It's a good chance to discuss it.
> >
> > I think this discussion aims to identify what types of topic names we all
> > need to restrict.
> >
> > I know three topic names that need to be restricted at the moment.
> >
> > 1. The `-partition-` keyword.
> > 2. Topic name characters validation.
> > 3. System topic prefix `__`.
> >
> >
> > Please feel free to leave your comments.
> > I will keep this discussion for a week. If there are no more new types of
> > restrictions, I will refine the previous PIP-242[0] to explain more details.
> > > If we have other restrictions behind this discussion. We can draft a new
> > PIP to add it directly.
> > Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help
> > pulsar have a good topic name restriction.
> >
> > Best,
> > Mattison
> >
> > [0] https://github.com/apache/pulsar/issues/19239
> > [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
> >

Re: [DISCUSS] Topic name restriction

Posted by Yong Zhang <zh...@gmail.com>.
Mattison,

I agree with you about restricting the topic name.

How about using a blacklist to restrict it?

We can have a blacklist for topic name restrictions and make it
configurable, with the keywords you mentioned in the default configuration.
That would give us a more general way to block the creation of a topic name.
If we add more restrictions on topic names in the future, this approach
makes it easy to accommodate them without changing any code.

Thanks,
Yong

On Thu, 2 Feb 2023 at 07:33, <ma...@gmail.com> wrote:

> Hi, All
>
> In the current implementation, pulsar didn't support topic name
> restriction. It's a good chance to discuss it.
>
> I think this discussion aims to identify what types of topic names we all
> need to restrict.
>
> I know three topic names that need to be restricted at the moment.
>
> 1. The `-partition-` keyword.
> 2. Topic name characters validation.
> 3. System topic prefix `__`.
>
>
> Please feel free to leave your comments.
> I will keep this discussion for a week. If there are no more new types of
> restrictions, I will refine the previous PIP-242[0] to explain more details.
> > If we have other restrictions behind this discussion. We can draft a new
> PIP to add it directly.
> Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help
> pulsar have a good topic name restriction.
>
> Best,
> Mattison
>
> [0] https://github.com/apache/pulsar/issues/19239
> [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2
>