Posted to dev@kafka.apache.org by Cheng Tan <ct...@confluent.io> on 2020/07/07 17:40:34 UTC

Re: [DISCUSSION] KIP-619: Add internal topic creation support

Hi Colin,


Thanks for the comments. I’ve modified the KIP accordingly.

> I think we need to understand which of these limitations we will carry forward and which we will not.  We also have the option of putting limitations just on consumer offsets, but not on other internal topics.


In the proposal, I added details about this. I agree that cluster admins should use ACLs to apply the restrictions.
* Internal topic creation will be allowed.
* Internal topic deletion will be allowed except for `__consumer_offsets` and `__transaction_state`.
* Producing to internal topic partitions other than `__consumer_offsets` and `__transaction_state` will be allowed.
* Adding internal topic partitions to transactions will be allowed.

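As a rough illustration only, the deletion and produce rules above could be expressed as a small check like the one below. This is a sketch, not code from the KIP or from Kafka: the class name `InternalTopicPolicy` and its methods are hypothetical, and any ACLs configured by the cluster admin would still apply on top of these checks.

```java
import java.util.Set;

// Hypothetical helper, not part of Kafka or KIP-619.
// It only encodes the two name-based exceptions listed above.
final class InternalTopicPolicy {
    // The two broker-managed internal topics named in this thread.
    static final Set<String> BROKER_MANAGED =
        Set.of("__consumer_offsets", "__transaction_state");

    private InternalTopicPolicy() {}

    // Deletion is allowed for every internal topic except the two broker-managed ones.
    static boolean canDelete(String topic) {
        return !BROKER_MANAGED.contains(topic);
    }

    // Producing is allowed for every internal topic except the two broker-managed ones.
    static boolean canProduce(String topic) {
        return !BROKER_MANAGED.contains(topic);
    }
}
```

For example, `InternalTopicPolicy.canDelete("connect-configs")` would return true, while `InternalTopicPolicy.canDelete("__consumer_offsets")` would return false.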
> I think there are a fair number of compatibility concerns.  What's the result if someone tries to create a topic with the configuration internal = true right now?  Does it fail?  If not, that seems like a potential problem.

I also added this compatibility issue in the "Compatibility, Deprecation, and Migration Plan" section.

Please feel free to make any suggestions or comments regarding my latest proposal. Thanks.


Best, - Cheng Tan






> On Jun 15, 2020, at 11:18 AM, Colin McCabe <cm...@apache.org> wrote:
> 
> Hi Cheng,
> 
> The link from the main KIP page is an "edit link" meaning that it drops you into the editor for the wiki page.  I think the link you meant to use is a "view link" that will just take you to view the page.
> 
> In general I'm not sure what I'm supposed to take away from the large UML diagram in the KIP.  This is just a description of the existing code, right?  Seems like we should remove this.
> 
> I'm not sure why the controller classes are featured here since as far as I can tell, the controller doesn't need to care if a topic is internal.
> 
>> Kafka and its upstream applications treat internal topics differently from
>> non-internal topics. For example:
>> * Kafka handles topic creation response errors differently for internal topics
>> * Internal topic partitions cannot be added to a transaction
>> * Internal topic records cannot be deleted
>> * Appending to internal topics might get rejected
> 
> I think we need to understand which of these limitations we will carry forward and which we will not.  We also have the option of putting limitations just on consumer offsets, but not on other internal topics.
> 
> Taking it one by one:
> 
>> * Kafka handles topic creation response errors differently for internal topics.
> 
> Hmm.  Kafka doesn't currently allow you to create internal topics, so the difference here is that you always fail, right?  Or is there something else more subtle here?  Like do we specifically prevent you from creating topics named __consumer_offsets or something?  We need to spell this all out in the KIP.
> 
>> * Internal topic partitions cannot be added to a transaction
> 
> I don't think we should carry this limitation forward, or if we do, we should only do it for consumer-offsets.  Does anyone know why this limitation exists?
> 
>> * Internal topic records cannot be deleted
> 
> This seems like something that should be handled by ACLs rather than by treating internal topics specially.
> 
>> * Appending to internal topics might get rejected
> 
> We clearly need to use ACLs here rather than rejecting appends.  Otherwise, how will external systems like KSQL, streams, etc. use this feature?  This is the kind of information we need to have in the KIP.
> 
>> Public Interfaces
>> 2. KafkaZkClient will have a new method getInternalTopics() which 
>> returns a set of internal topic name strings.
> 
> KafkaZkClient isn't a public interface, so it doesn't need to be described here.
> 
>> There are no compatibility concerns in this KIP.
> 
> I think there are a fair number of compatibility concerns.  What's the result if someone tries to create a topic with the configuration internal = true right now?  Does it fail?  If not, that seems like a potential problem.
> 
> Are people going to be able to create or delete topics named __consumer_offsets or __transaction_state using this mechanism?  If so, how does the security model work for that?
> 
> best,
> Colin
> 
> On Fri, May 29, 2020, at 01:09, Cheng Tan wrote:
>> Hello developers,
>> 
>> 
>> I’m proposing KIP-619 to add internal topic creation support. 
>> 
>> Kafka and its upstream applications treat internal topics differently 
>> from non-internal topics. For example:
>> 
>> 	• Kafka handles topic creation response errors differently for internal topics
>> 	• Internal topic partitions cannot be added to a transaction
>> 	• Internal topic records cannot be deleted
>> 	• Appending to internal topics might get rejected
>> 	• ……
>> 
>> Clients and upstream applications may define their own internal topics. 
>> For example, Kafka Connect defines `connect-configs`, 
>> `connect-offsets`, and `connect-statuses`. Clients are fetching the 
>> internal topics by sending the MetadataRequest (ApiKeys.METADATA).
>> 
>> However, clients and upstream application cannot register their own 
>> internal topics in servers. As a result, servers have no knowledge 
>> about client-defined internal topics. They can only test if a given 
>> topic is internal or not simply by checking against a static set of 
>> internal topic string, which consists of two internal topic names 
>> `__consumer_offsets` and `__transaction_state`. As a result, 
>> MetadataRequest cannot provide any information about client created 
>> internal topics. 
>> 
>> To solve this pain point, I'm proposing support for clients to register 
>> and query their own internal topics. 
>> 
>> Please feel free to join the discussion. Thanks in advance.
>> 
>> 
>> Best, - Cheng Tan


Re: [DISCUSSION] KIP-619: Add internal topic creation support

Posted by Cheng Tan <ct...@confluent.io>.
Hi David,


Thanks for the feedback. It is really helpful.

> Can you clarify a bit more what the difference is between regular topics
> and internal topics (excluding  __consumer_offsets and
> __transaction_state)? Reading your last message, if internal topics
> (excluding the two) can be created, deleted, produced to, consumed from,
> added to transactions, I'm failing to see what is different about them. Is
> it simply that they are marked as "internal" so the application can treat
> them differently?

Yes. The user-defined internal topics (all internal topics other than `__consumer_offsets` and `__transaction_state`) will behave like normal topics with regard to messaging operations and permissions. Topics are marked as “internal” so that the broker can identify user-defined internal topics and provide better metadata services, such as the `listTopics` API. I should have described the metadata behavior difference in the KIP.
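
To make the metadata difference concrete, here is a minimal self-contained model of the idea. It is a sketch under assumptions, not the KIP's design or Kafka's broker code: `TopicCatalog`, `create`, `isInternal`, and `list` are hypothetical names. The only point is that a topic registered as internal is reported as internal and can be hidden from a listing, while otherwise being an ordinary topic.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of broker-side topic metadata; not Kafka code.
final class TopicCatalog {
    // The static set the broker checks today, as described in this thread.
    private static final Set<String> BUILT_IN =
        Set.of("__consumer_offsets", "__transaction_state");

    private final Set<String> allTopics = ConcurrentHashMap.newKeySet();
    private final Set<String> userInternal = ConcurrentHashMap.newKeySet();

    // Record a topic, optionally marking it as a user-defined internal topic.
    void create(String name, boolean internal) {
        allTopics.add(name);
        if (internal) {
            userInternal.add(name);
        }
    }

    // A topic is internal if it is built in or was registered as internal.
    boolean isInternal(String name) {
        return BUILT_IN.contains(name) || userInternal.contains(name);
    }

    // Return topic names in sorted order, hiding internal ones unless requested.
    Set<String> list(boolean includeInternal) {
        Set<String> out = new TreeSet<>();
        for (String topic : allTopics) {
            if (includeInternal || !isInternal(topic)) {
                out.add(topic);
            }
        }
        return out;
    }
}
```

With `orders` created as a normal topic and `connect-configs` created as internal, `list(false)` would contain only `orders`, and `list(true)` would contain both.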

> In the "Compatibility, Deprecation, and Migration" section, we should
> detail how users can overcome this incompatibility (i.e., changing the
> config name on their topic and changing their application logic if
> necessary).

Thanks for the suggestion. I updated the section.

> Should we consider adding any configs to constrain the min isr and
> replication factor for internal topics? If a topic is really internal and
> fundamentally required for an application to function, it might need a more
> stringent replication config. Our existing internal topics have their own
> configs in server.properties with a comment saying as much.


I think we should probably give clients the freedom to configure `min.insync.replicas`, `replication.factor`, and `log.retention` on user-defined internal topics as they do on normal topics.

1. Users may have performance requirements on user-defined internal topics.
2. Potential new defaults / restrictions may change the existing user application logic silently. There might be compatibility issues.
3. Since user-defined internal topics act like normal topics and won’t affect the messaging functionality (produce, consume, transaction, etc), unoptimized log configurations won’t harm the cluster. 


Please let me know what you think. Thanks.


Best, - Cheng Tan



> On Aug 14, 2020, at 7:44 AM, David Arthur <da...@confluent.io> wrote:
> 
> Cheng,
> 
> Can you clarify a bit more what the difference is between regular topics
> and internal topics (excluding  __consumer_offsets and
> __transaction_state)? Reading your last message, if internal topics
> (excluding the two) can be created, deleted, produced to, consumed from,
> added to transactions, I'm failing to see what is different about them. Is
> it simply that they are marked as "internal" so the application can treat
> them differently?
> 
> 
> In the "Compatibility, Deprecation, and Migration" section, we should
> detail how users can overcome this incompatibility (i.e., changing the
> config name on their topic and changing their application logic if
> necessary).
> 
> 
> Should we consider adding any configs to constrain the min isr and
> replication factor for internal topics? If a topic is really internal and
> fundamentally required for an application to function, it might need a more
> stringent replication config. Our existing internal topics have their own
> configs in server.properties with a comment saying as much.
> 
> 
> Thanks!
> David
> 
> 
> 
> On Tue, Jul 7, 2020 at 1:40 PM Cheng Tan <ct...@confluent.io> wrote:
> 
>> Hi Colin,
>> 
>> 
>> Thanks for the comments. I’ve modified the KIP accordingly.
>> 
>>> I think we need to understand which of these limitations we will carry
>> forward and which we will not.  We also have the option of putting
>> limitations just on consumer offsets, but not on other internal topics.
>> 
>> 
>> In the proposal, I added details about this. I agree that cluster admin
>> should use ACLs to apply the restrictions.
>> Internal topic creation will be allowed.
>> Internal topic deletion will be allowed except for` __consumer_offsets`
>> and `__transaction_state`.
>> Producing to internal topic partitions other than `__consumer_offsets` and
>> `__transaction_state` will be allowed.
>> Adding internal topic partitions to transactions will be allowed.
>>> I think there are a fair number of compatibility concerns.  What's the
>> result if someone tries to create a topic with the configuration internal =
>> true right now?  Does it fail?  If not, that seems like a potential problem.
>> 
>> I also added this compatibility issue in the "Compatibility, Deprecation,
>> and Migration Plan" section.
>> 
>> Please feel free to make any suggestions or comments regarding to my
>> latest proposal. Thanks.
>> 
>> 
>> Best, - Cheng Tan
>> 
>> 
>> 
>> 
>> 
>> 
>>> On Jun 15, 2020, at 11:18 AM, Colin McCabe <cm...@apache.org> wrote:
>>> 
>>> Hi Cheng,
>>> 
>>> The link from the main KIP page is an "edit link" meaning that it drops
>> you into the editor for the wiki page.  I think the link you meant to use
>> is a "view link" that will just take you to view the page.
>>> 
>>> In general I'm not sure what I'm supposed to take away from the large
>> UML diagram in the KIP.  This is just a description of the existing code,
>> right?  Seems like we should remove this.
>>> 
>>> I'm not sure why the controller classes are featured here since as far
>> as I can tell, the controller doesn't need to care if a topic is internal.
>>> 
>>>> Kafka and its upstream applications treat internal topics differently
>> from
>>>> non-internal topics. For example:
>>>> * Kafka handles topic creation response errors differently for internal
>> topics
>>>> * Internal topic partitions cannot be added to a transaction
>>>> * Internal topic records cannot be deleted
>>>> * Appending to internal topics might get rejected
>>> 
>>> I think we need to understand which of these limitations we will carry
>> forward and which we will not.  We also have the option of putting
>> limitations just on consumer offsets, but not on other internal topics.
>>> 
>>> Taking it one by one:
>>> 
>>>> * Kafka handles topic creation response errors differently for internal
>> topics.
>>> 
>>> Hmm.  Kafka doesn't currently allow you to create internal topics, so
>> the difference here is that you always fail, right?  Or is there something
>> else more subtle here?  Like do we specifically prevent you from creating
>> topics named __consumer_offsets or something?  We need to spell this all
>> out in the KIP.
>>> 
>>>> * Internal topic partitions cannot be added to a transaction
>>> 
>>> I don't think we should carry this limitation forward, or if we do, we
>> should only do it for consumer-offsets.  Does anyone know why this
>> limitation exists?
>>> 
>>>> * Internal topic records cannot be deleted
>>> 
>>> This seems like something that should be handled by ACLs rather than by
>> treating internal topics specially.
>>> 
>>>> * Appending to internal topics might get rejected
>>> 
>>> We clearly need to use ACLs here rather than rejecting appends.
>> Otherwise, how will external systems like KSQL, streams, etc. use this
>> feature?  This is the kind of information we need to have in the KIP.
>>> 
>>>> Public Interfaces
>>>> 2. KafkaZkClient will have a new method getInternalTopics() which
>>>> returns a set of internal topic name strings.
>>> 
>>> KafkaZkClient isn't a public interface, so it doesn't need to be
>> described here.
>>> 
>>>> There are no compatibility concerns in this KIP.
>>> 
>>> I think there are a fair number of compatibility concerns.  What's the
>> result if someone tries to create a topic with the configuration internal =
>> true right now?  Does it fail?  If not, that seems like a potential problem.
>>> 
>>> Are people going to be able to create or delete topics named
>> __consumer_offsets or __transaction_state using this mechanism?  If so, how
>> does the security model work for that?
>>> 
>>> best,
>>> Colin
>>> 
>>> On Fri, May 29, 2020, at 01:09, Cheng Tan wrote:
>>>> Hello developers,
>>>> 
>>>> 
>>>> I’m proposing KIP-619 to add internal topic creation support.
>>>> 
>>>> Kafka and its upstream applications treat internal topics differently
>>>> from non-internal topics. For example:
>>>> 
>>>>     • Kafka handles topic creation response errors differently for
>> internal topics
>>>>     • Internal topic partitions cannot be added to a transaction
>>>>     • Internal topic records cannot be deleted
>>>>     • Appending to internal topics might get rejected
>>>>     • ……
>>>> 
>>>> Clients and upstream applications may define their own internal topics.
>>>> For example, Kafka Connect defines `connect-configs`,
>>>> `connect-offsets`, and `connect-statuses`. Clients are fetching the
>>>> internal topics by sending the MetadataRequest (ApiKeys.METADATA).
>>>> 
>>>> However, clients and upstream application cannot register their own
>>>> internal topics in servers. As a result, servers have no knowledge
>>>> about client-defined internal topics. They can only test if a given
>>>> topic is internal or not simply by checking against a static set of
>>>> internal topic string, which consists of two internal topic names
>>>> `__consumer_offsets` and `__transaction_state`. As a result,
>>>> MetadataRequest cannot provide any information about client created
>>>> internal topics.
>>>> 
>>>> To solve this pain point, I'm proposing support for clients to register
>>>> and query their own internal topics.
>>>> 
>>>> Please feel free to join the discussion. Thanks in advance.
>>>> 
>>>> 
>>>> Best, - Cheng Tan
>> 
>> 
> 
> -- 
> -David


Re: [DISCUSSION] KIP-619: Add internal topic creation support

Posted by David Arthur <da...@confluent.io>.
Cheng,

Can you clarify a bit more what the difference is between regular topics
and internal topics (excluding  __consumer_offsets and
__transaction_state)? Reading your last message, if internal topics
(excluding the two) can be created, deleted, produced to, consumed from,
added to transactions, I'm failing to see what is different about them. Is
it simply that they are marked as "internal" so the application can treat
them differently?


In the "Compatibility, Deprecation, and Migration" section, we should
detail how users can overcome this incompatibility (i.e., changing the
config name on their topic and changing their application logic if
necessary).


Should we consider adding any configs to constrain the min isr and
replication factor for internal topics? If a topic is really internal and
fundamentally required for an application to function, it might need a more
stringent replication config. Our existing internal topics have their own
configs in server.properties with a comment saying as much.


Thanks!
David



On Tue, Jul 7, 2020 at 1:40 PM Cheng Tan <ct...@confluent.io> wrote:

> Hi Colin,
>
>
> Thanks for the comments. I’ve modified the KIP accordingly.
>
> > I think we need to understand which of these limitations we will carry
> forward and which we will not.  We also have the option of putting
> limitations just on consumer offsets, but not on other internal topics.
>
>
> In the proposal, I added details about this. I agree that cluster admin
> should use ACLs to apply the restrictions.
> Internal topic creation will be allowed.
> Internal topic deletion will be allowed except for` __consumer_offsets`
> and `__transaction_state`.
> Producing to internal topic partitions other than `__consumer_offsets` and
> `__transaction_state` will be allowed.
> Adding internal topic partitions to transactions will be allowed.
> > I think there are a fair number of compatibility concerns.  What's the
> result if someone tries to create a topic with the configuration internal =
> true right now?  Does it fail?  If not, that seems like a potential problem.
>
> I also added this compatibility issue in the "Compatibility, Deprecation,
> and Migration Plan" section.
>
> Please feel free to make any suggestions or comments regarding to my
> latest proposal. Thanks.
>
>
> Best, - Cheng Tan
>
>
>
>
>
>
> > On Jun 15, 2020, at 11:18 AM, Colin McCabe <cm...@apache.org> wrote:
> >
> > Hi Cheng,
> >
> > The link from the main KIP page is an "edit link" meaning that it drops
> you into the editor for the wiki page.  I think the link you meant to use
> is a "view link" that will just take you to view the page.
> >
> > In general I'm not sure what I'm supposed to take away from the large
> UML diagram in the KIP.  This is just a description of the existing code,
> right?  Seems like we should remove this.
> >
> > I'm not sure why the controller classes are featured here since as far
> as I can tell, the controller doesn't need to care if a topic is internal.
> >
> >> Kafka and its upstream applications treat internal topics differently
> from
> >> non-internal topics. For example:
> >> * Kafka handles topic creation response errors differently for internal
> topics
> >> * Internal topic partitions cannot be added to a transaction
> >> * Internal topic records cannot be deleted
> >> * Appending to internal topics might get rejected
> >
> > I think we need to understand which of these limitations we will carry
> forward and which we will not.  We also have the option of putting
> limitations just on consumer offsets, but not on other internal topics.
> >
> > Taking it one by one:
> >
> >> * Kafka handles topic creation response errors differently for internal
> topics.
> >
> > Hmm.  Kafka doesn't currently allow you to create internal topics, so
> the difference here is that you always fail, right?  Or is there something
> else more subtle here?  Like do we specifically prevent you from creating
> topics named __consumer_offsets or something?  We need to spell this all
> out in the KIP.
> >
> >> * Internal topic partitions cannot be added to a transaction
> >
> > I don't think we should carry this limitation forward, or if we do, we
> should only do it for consumer-offsets.  Does anyone know why this
> limitation exists?
> >
> >> * Internal topic records cannot be deleted
> >
> > This seems like something that should be handled by ACLs rather than by
> treating internal topics specially.
> >
> >> * Appending to internal topics might get rejected
> >
> > We clearly need to use ACLs here rather than rejecting appends.
> Otherwise, how will external systems like KSQL, streams, etc. use this
> feature?  This is the kind of information we need to have in the KIP.
> >
> >> Public Interfaces
> >> 2. KafkaZkClient will have a new method getInternalTopics() which
> >> returns a set of internal topic name strings.
> >
> > KafkaZkClient isn't a public interface, so it doesn't need to be
> described here.
> >
> >> There are no compatibility concerns in this KIP.
> >
> > I think there are a fair number of compatibility concerns.  What's the
> result if someone tries to create a topic with the configuration internal =
> true right now?  Does it fail?  If not, that seems like a potential problem.
> >
> > Are people going to be able to create or delete topics named
> __consumer_offsets or __transaction_state using this mechanism?  If so, how
> does the security model work for that?
> >
> > best,
> > Colin
> >
> > On Fri, May 29, 2020, at 01:09, Cheng Tan wrote:
> >> Hello developers,
> >>
> >>
> >> I’m proposing KIP-619 to add internal topic creation support.
> >>
> >> Kafka and its upstream applications treat internal topics differently
> >> from non-internal topics. For example:
> >>
> >>      • Kafka handles topic creation response errors differently for
> internal topics
> >>      • Internal topic partitions cannot be added to a transaction
> >>      • Internal topic records cannot be deleted
> >>      • Appending to internal topics might get rejected
> >>      • ……
> >>
> >> Clients and upstream applications may define their own internal topics.
> >> For example, Kafka Connect defines `connect-configs`,
> >> `connect-offsets`, and `connect-statuses`. Clients are fetching the
> >> internal topics by sending the MetadataRequest (ApiKeys.METADATA).
> >>
> >> However, clients and upstream application cannot register their own
> >> internal topics in servers. As a result, servers have no knowledge
> >> about client-defined internal topics. They can only test if a given
> >> topic is internal or not simply by checking against a static set of
> >> internal topic string, which consists of two internal topic names
> >> `__consumer_offsets` and `__transaction_state`. As a result,
> >> MetadataRequest cannot provide any information about client created
> >> internal topics.
> >>
> >> To solve this pain point, I'm proposing support for clients to register
> >> and query their own internal topics.
> >>
> >> Please feel free to join the discussion. Thanks in advance.
> >>
> >>
> >> Best, - Cheng Tan
>
>

-- 
-David