Posted to dev@pulsar.apache.org by 唐谊 <ss...@gmail.com> on 2019/09/03 05:12:32 UTC

[DISCUSS] PIP: Producer Send Message with Different Schema

Hi all;

I am drafting a proposal to let a producer send messages with
different schemas.

## Motivation
Currently, a Pulsar producer can only produce messages of one schema,
which is determined by the user when the producer is created, or by fetching
the latest version of the schema from the registry if the AUTO_PRODUCE_BYTES
type is specified. The schema, however, can be updated by an external system
after the producer has started, which would lead to inconsistency between the
message payload and the schema version metadata. Also, some scenarios, such as
replicating from Kafka, require a single producer to replicate messages of
different schemas from one Kafka partition to one Pulsar partition, to
guarantee ordering and avoid duplicates.

I propose that each message can indicate its associated schema itself.
For more detail, see
https://gist.github.com/yittg/56c6dedf7509f634ec7effc4f6f3631d#file-pip-md
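
To make the idea concrete, here is a minimal Java sketch of a producer that resolves the schema per message and stamps each message with its schema version. It is only an illustration under stated assumptions, not the design in the gist and not the Pulsar client API: ToyRegistry, ToyMessage and MultiSchemaProducer are hypothetical names, and a schema is reduced to a plain string.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a schema registry (not a Pulsar API):
// it assigns a version number to each distinct schema definition.
final class ToyRegistry {
    private final Map<String, Long> versions = new HashMap<>();

    long getOrRegister(String schemaDefinition) {
        Long existing = versions.get(schemaDefinition);
        if (existing != null) {
            return existing;
        }
        long version = versions.size(); // next unused version number
        versions.put(schemaDefinition, version);
        return version;
    }
}

// A message that carries the version of the schema it was encoded with.
final class ToyMessage {
    final long schemaVersion;
    final String payload;

    ToyMessage(long schemaVersion, String payload) {
        this.schemaVersion = schemaVersion;
        this.payload = payload;
    }
}

// Producer that resolves the schema per message instead of once at creation.
final class MultiSchemaProducer {
    private final ToyRegistry registry;
    private final Map<String, Long> versionCache = new HashMap<>();
    final List<ToyMessage> sent = new ArrayList<>();

    MultiSchemaProducer(ToyRegistry registry) {
        this.registry = registry;
    }

    ToyMessage send(String schemaDefinition, String payload) {
        // Only the first message of each schema needs a registry lookup.
        long version = versionCache.computeIfAbsent(
                schemaDefinition, def -> registry.getOrRegister(def));
        ToyMessage message = new ToyMessage(version, payload);
        sent.add(message); // a single producer, so send order is preserved
        return message;
    }
}

final class MultiSchemaDemo {
    public static void main(String[] args) {
        MultiSchemaProducer producer = new MultiSchemaProducer(new ToyRegistry());
        producer.send("User{id,name}", "1,alice");
        producer.send("Order{id,total}", "7,19.99");
        producer.send("User{id,name}", "2,bob");
        for (ToyMessage m : producer.sent) {
            System.out.println(m.schemaVersion + " " + m.payload);
        }
    }
}
```

Because every message records its own schema version, a consumer can decode interleaved message types from one partition without relying on a producer-wide schema.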

Looking forward to any feedback.

Thanks,
Yi

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Yi Tang <ss...@gmail.com>.
Hi Raman,

I sincerely apologize for misspelling your name.

Yi

Yi Tang <ss...@gmail.com> wrote on Mon, Sep 16, 2019 at 09:32:

> Hi rarma,
>
> It's a great and important feature, I think. This PIP only requires the
> compatibility check from the underlying registry and doesn't touch its
> implementation details. I think we should address this feature in the
> future, and this PIP provides the essential ability to implement it.
>
> Thanks,
> Yi
>
> rocketraman@gmail.com <ro...@gmail.com> wrote on Sun, Sep 15, 2019 at 22:36:
>
>> I see a mention of compatibility in the PIP but with no details.  The
>> docs about schema compatibility state this:
>>
>> > Consequently, those events need to go in the same Pulsar partition to
>> > maintain order. This application can use ALWAYS_COMPATIBLE to allow
>> > different kinds of events co-exist in the same topic.
>>
>> With this PIP, this limitation can be relaxed, and schema compatibility
>> should be able to be strengthened, since each type of message on a topic
>> can have its own schema, and compatibility can then be checked against only
>> other schemas for the same type. Kafka does this via the concept of
>> "subjects" in the schema registry, and subjects default to just the topic
>> name (plus a "-key" or "-value" suffix since keys and values can both have
>> their own schemas), but can also include (via an injectable strategy) the
>> message type. Compatibility is managed at the subject level.
>>
>> Is this something that should be addressed in this PIP, or in future
>> follow-on work? This is critical to supporting ordering across different
>> message types, with schema compatibility verification by Pulsar.
>>
>> Regards,
>> Raman

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by ro...@gmail.com, ro...@gmail.com.
> On Wed, Apr 15, 2020 at 4:25 AM Shivji Kumar Jha <sh...@gmail.com> wrote:
> 
> > Hi Sijie,
> >
> > I second Raman. Apart from PIP-43 and PIP-44, which ease schema
> > management, in my opinion, we should also loosely couple the association
> > between topic and schema (or more precisely *type of data* on topic) which
> > is 1 to 1 as of now.

That's it exactly.

On 2020/04/15 22:23:03, Sijie Guo <gu...@gmail.com> wrote: 
> I see. I wasn't sure that Raman is looking for this capability based on his
> previous email.

Sorry for the confusion. You would have had to look further back in the email thread -- in a previous message I describe Kafka's decoupling of topic/schema association via the "subject" concept, and the capability this provides in terms of tracking compatibility for multiple message types on one topic. Thanks Shivji for picking this up -- I look forward to reading your PIP....
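
As a rough illustration of the subject idea, the Java sketch below keeps one version history per subject and checks a new schema only against earlier versions of the same subject. It is not the Kafka schema registry or any Pulsar API: ToySchema, SubjectRegistry and the "keep every existing field" rule are assumptions made for this example, and the topic and event names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: a schema is reduced to its set of field names.
final class ToySchema {
    final Set<String> fields;

    ToySchema(String... fields) {
        this.fields = new HashSet<>(Arrays.asList(fields));
    }
}

// Registry that keeps one version history per subject and checks
// compatibility only against versions of the same subject.
final class SubjectRegistry {
    private final Map<String, List<ToySchema>> versionsBySubject = new HashMap<>();

    // Toy rule (an assumption for this sketch): a new version is accepted
    // only if it keeps every field of the latest version of its subject.
    boolean register(String subject, ToySchema candidate) {
        List<ToySchema> history =
                versionsBySubject.computeIfAbsent(subject, s -> new ArrayList<>());
        if (!history.isEmpty()) {
            ToySchema latest = history.get(history.size() - 1);
            if (!candidate.fields.containsAll(latest.fields)) {
                return false; // rejected: drops a field of the same subject
            }
        }
        history.add(candidate);
        return true;
    }

    int versionCount(String subject) {
        return versionsBySubject
                .getOrDefault(subject, Collections.<ToySchema>emptyList())
                .size();
    }
}

final class SubjectDemo {
    // One possible naming strategy: topic name plus message type.
    static String subject(String topic, String messageType) {
        return topic + "-" + messageType;
    }

    public static void main(String[] args) {
        SubjectRegistry registry = new SubjectRegistry();
        String created = subject("customers", "customerCreated");
        String paid = subject("customers", "customerInvoicePaid");

        System.out.println(registry.register(created, new ToySchema("id", "name")));
        // A different event type on the same topic is checked only against its own subject.
        System.out.println(registry.register(paid, new ToySchema("invoiceId", "amount")));
        // Evolving one subject: adding a field passes, dropping fields fails.
        System.out.println(registry.register(created, new ToySchema("id", "name", "email")));
        System.out.println(registry.register(created, new ToySchema("id")));
    }
}
```

The point of keying on the subject rather than the topic is that unrelated message types can share a topic, and so keep their ordering, while each type still gets a real compatibility check.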

Regards,
Raman


> 
> I do agree that decoupling the relationship between topic and schema can
> drive more use cases. It is a great feature to add.
> 
> We will pick this up and come up with a PIP for introducing this capability.
> 
> Thanks,
> Sijie
> 
> On Wed, Apr 15, 2020 at 4:25 AM Shivji Kumar Jha <sh...@gmail.com> wrote:
> 
> > Hi Sijie,
> >
> > I second Raman. Apart from PIP-43 and PIP-44, which ease schema
> > management, in my opinion, we should also loosely couple the association
> > between topic and schema (or more precisely *type of data* on topic) which
> > is 1 to 1 as of now.
> >
> >    1. The schema (or schema versions of one data type) could be grouped
> >    into what Kafka calls *subject*.
> >    2. The schema compatibility should then be done among schemas in the
> >    same subject only.
> >    3. One topic can associate with multiple schema subjects and have their
> >    own evolution paths.
> >    4. Similarly, one subject can also associate to multiple topics.
> >
> > *Use case:*
> > This feature would be handy when one needs different business models in a
> > strictly ordered fashion. At the same time, these business models have
> > their own evolution paths too. As an example, an event sourcing system
> > could have events such as customerCreated, customerAddressChanged, and
> > customerInvoicePaid that must be consumed in order.
> >
> > The ideas presented above are picked from here
> > <https://martin.kleppmann.com/2018/01/18/event-types-in-kafka-topic.html>.
> >
> > Regards,
> > Shivji Kumar Jha
> > http://www.shivjijha.com/
> > +91 8884075512
> >
> >
> > On Wed, Apr 15, 2020 at 2:27 AM Sijie Guo <gu...@gmail.com> wrote:
> >
> > > Hi Raman,
> > >
> > > The schema compatibility strategies were already there prior to PIP-43.
> > >
> > > PIP-44 enhances the schema compatibility strategy support.
> > >
> > > Both changes have already landed in the 2.5.0 release.
> > >
> > > Did you see any issues when you tried out this feature?
> > >
> > > - Sijie
> > >
> > > On Tue, Apr 14, 2020 at 8:35 AM rocketraman@gmail.com <
> > > rocketraman@gmail.com>
> > > wrote:
> > >
> > > > Now that PIP-43 is released in 2.5.0, I wanted to follow up on the
> > > > messages below.
> > > >
> > > > What is remaining to be done in Pulsar to support having multiple
> > > > different types on one topic in Pulsar? Yi indicates below that PIP-43
> > > sets
> > > > the stage for this, but that the schema compatibility implementation
> > > still
> > > > would need some work.
> > > >
> > > > Would this require another PIP, or just an issue to track the work?
> > > >
> > > > Regards,
> > > > Raman
> > > >
> > > > On 2019/09/16 01:32:39, Yi Tang <ss...@gmail.com> wrote:
> > > > > Hi rarma,
> > > > >
> > > > > It's a great and important feature, I think. This PIP requires the
> > > > > compatibility check from bottom registry only and doesn't touch the
> > > > > implementation detail. I think we should address this feature in the
> > > > > future, and this PIP provides the essential ability to implement it.
> > > > >
> > > > > Thanks,
> > > > > Yi
> > > > >
> > > > > rocketraman@gmail.com <ro...@gmail.com> 于 2019年9月15日周日
> > 22:36写道:
> > > > >
> > > > > > I see a mention of compatibility in the PIP but with no details.
> > The
> > > > docs
> > > > > > about schema compatibility state this:
> > > > > >
> > > > > > > Consequently, those events need to go in the same Pulsar
> > partition
> > > to
> > > > > > maintain order. This application can use ALWAYS_COMPATIBLE to allow
> > > > > > different kinds of events co-exist in the same topic.
> > > > > >
> > > > > > With this PIP, this limitation can be relaxed, and schema
> > > compatibility
> > > > > > should be able to be strengthened, since each type of message on a
> > > > topic
> > > > > > can have its own schema, and compatibility can then be checked
> > > against
> > > > only
> > > > > > other schemas for the same type. Kafka does this via the concept of
> > > > > > "subjects" in the schema registry, and subjects default to just the
> > > > topic
> > > > > > name (plus a "-key" or "-value" suffix since keys and values can
> > both
> > > > have
> > > > > > their own schemas), but can also include (via an injectable
> > strategy)
> > > > the
> > > > > > message type. Compatibility is managed at the subject level.
> > > > > >
> > > > > > Is this something that should be addressed in this PIP, or in
> > future
> > > > > > follow-on work? This is critical to supporting ordering across
> > > > different
> > > > > > message types, with schema compatibility verification by Pulsar.
> > > > > >
> > > > > > Regards,
> > > > > > Raman
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 2019/09/03 05:12:32, 唐谊 <ss...@gmail.com> wrote:
> > > > > > > Hi all;
> > > > > > >
> > > > > > > I am drafting a proposal to support the producer to send messages
> > > > with
> > > > > > > different schema.
> > > > > > >
> > > > > > > ## Motivation
> > > > > > > For now, Pulsar producer can only produce messages of one type of
> > > > schema
> > > > > > > which is determined by user when it is created, or by fecthing
> > the
> > > > latest
> > > > > > > version of schema from registry if AUTO_PRODUCE_BYTES type is
> > > > specified.
> > > > > > > Schema, however, can be updated by external system after producer
> > > > > > started,
> > > > > > > which would lead to inconsistency between messsage payload and
> > > schema
> > > > > > > version metadata. Also some senarios like replicating from kafka
> > > > require
> > > > > > a
> > > > > > > single producer for replicating messages of different schemas
> > from
> > > > one
> > > > > > > Kafka partition to one Pulsar partition to guarantee the order
> > and
> > > no
> > > > > > > duplicates.
> > > > > > >
> > > > > > > Here proposing that messages can indicate the associated schema
> > by
> > > > > > itself,
> > > > > > > for more detail,
> > > > > > >
> > > > > >
> > > >
> > >
> > https://gist.github.com/yittg/56c6dedf7509f634ec7effc4f6f3631d#file-pip-md
> > > > > > >
> > > > > > > Looking forward to any feedback.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Yi
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Shivji Kumar Jha <sh...@gmail.com>.
Sure Sijie.

Regards,
Shivji Kumar Jha
http://www.shivjijha.com/
+91 8884075512


On Thu, Apr 16, 2020 at 2:22 PM Sijie Guo <gu...@gmail.com> wrote:

> Yeah!
>
> I don't think there is anyone picking this up yet. You are very welcome to
> contribute to this feature. Can you start putting up a PIP for it?
>
> Thanks,
> Sijie
>
> On Wed, Apr 15, 2020 at 9:35 PM Shivji Kumar Jha <sh...@gmail.com>
> wrote:
>
> > Hi Sijie,
> >
> > If no one has picked this up, I would like to volunteer for this feature.
> >
> > Regards,
> > Shivji Kumar Jha
> > http://www.shivjijha.com/
> > +91 8884075512

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Sijie Guo <gu...@gmail.com>.
Yeah!

I don't think there is anyone picking this up yet. You are very welcome to
contribute to this feature. Can you start putting up a PIP for it?

Thanks,
Sijie


Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Shivji Kumar Jha <sh...@gmail.com>.
Hi Sijie,

If no one has picked this up, I would like to volunteer for this feature.

Regards,
Shivji Kumar Jha
http://www.shivjijha.com/
+91 8884075512



Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Sijie Guo <gu...@gmail.com>.
I see. Based on his previous email, I wasn't sure that Raman was looking
for this capability.

I do agree that decoupling the relationship between topic and schema can
drive more use cases. It is a great feature to add.

We will pick this up and come up with a PIP to introduce this capability.

Thanks,
Sijie

On Wed, Apr 15, 2020 at 4:25 AM Shivji Kumar Jha <sh...@gmail.com> wrote:

> Hi Sijie,
>
> I second Raman. Apart from PIP-43 and PIP-44, which ease schema
> management, I think we should also loosen the coupling between topic and
> schema (or, more precisely, the *type of data* on a topic), which is
> one-to-one as of now.
>
>    1. The schemas (or the schema versions of one data type) could be
>    grouped into what Kafka calls a *subject*.
>    2. Schema compatibility should then be checked only among schemas in
>    the same subject.
>    3. One topic can be associated with multiple schema subjects, each
>    with its own evolution path.
>    4. Similarly, one subject can be associated with multiple topics.
>
> *Use case:*
> This feature would be handy when events of different business models must
> be kept in strict order while each model follows its own evolution path.
> For example, an event sourcing system could have events such as
> customerCreated, customerAddressChanged, and customerInvoicePaid that must
> stay in order.
>
> The ideas presented above are picked from here
> <https://martin.kleppmann.com/2018/01/18/event-types-in-kafka-topic.html>.
>
> Regards,
> Shivji Kumar Jha
> http://www.shivjijha.com/
> +91 8884075512
>

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Shivji Kumar Jha <sh...@gmail.com>.
Hi Sijie,

I second Raman. Apart from PIP-43 and PIP-44, which ease schema
management, I think we should also loosen the coupling between topic and
schema (or, more precisely, the *type of data* on a topic), which is
one-to-one as of now.

   1. The schemas (or the schema versions of one data type) could be
   grouped into what Kafka calls a *subject*.
   2. Schema compatibility should then be checked only among schemas in
   the same subject.
   3. One topic can be associated with multiple schema subjects, each
   with its own evolution path.
   4. Similarly, one subject can be associated with multiple topics.
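
The four points above can be pictured with a small toy model. This is only a
sketch under stated assumptions, not Pulsar's or Kafka's actual registry:
the class `SubjectRegistry`, its methods, and the stand-in compatibility rule
(a new version may add fields but must not remove any) are all hypothetical.

```python
from collections import defaultdict


class SubjectRegistry:
    """Toy registry: schema versions are grouped by subject, and topics
    reference subjects many-to-many. All names here are hypothetical."""

    def __init__(self):
        self.versions = defaultdict(list)       # subject -> list of field sets
        self.topic_subjects = defaultdict(set)  # topic -> subjects allowed on it

    def attach(self, topic, subject):
        """Allow messages of the given subject on the given topic."""
        self.topic_subjects[topic].add(subject)

    def register(self, subject, fields):
        """Add a schema version, checked only against the same subject.

        Stand-in compatibility rule (an assumption for this sketch):
        the new version must keep every field of the latest version.
        """
        new = frozenset(fields)
        history = self.versions[subject]
        if history and not history[-1] <= new:
            raise ValueError(f"incompatible with latest {subject} schema")
        history.append(new)
        return len(history)  # 1-based version number within the subject


registry = SubjectRegistry()
registry.attach("customer-events", "customerCreated")
registry.attach("customer-events", "customerAddressChanged")
registry.attach("audit-log", "customerCreated")  # one subject, two topics

registry.register("customerCreated", {"id", "name"})
# Not compared against customerCreated, because it is a different subject.
registry.register("customerAddressChanged", {"id", "address"})
# Evolves customerCreated independently of the other subject.
registry.register("customerCreated", {"id", "name", "email"})
```

Each subject keeps its own version history, so a topic can carry several
event types in order while compatibility is still enforced per type.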

*Use case:*
This feature would be handy when events of different business models must
be kept in strict order while each model follows its own evolution path.
For example, an event sourcing system could have events such as
customerCreated, customerAddressChanged, and customerInvoicePaid that must
stay in order.

The ideas presented above are taken from here:
<https://martin.kleppmann.com/2018/01/18/event-types-in-kafka-topic.html>.

Regards,
Shivji Kumar Jha
http://www.shivjijha.com/
+91 8884075512


On Wed, Apr 15, 2020 at 2:27 AM Sijie Guo <gu...@gmail.com> wrote:

> Hi Raman,
>
> The schema compatibility strategies were already there prior to PIP-43.
>
> PIP-44 enhances the schema compatibility strategy support.
>
> Both changes have already landed in the 2.5.0 release.
>
> Did you see any issues when you tried out this feature?
>
> - Sijie

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Sijie Guo <gu...@gmail.com>.
Hi Raman,

The schema compatibility strategies were already there prior to PIP-43.

PIP-44 enhances the schema compatibility strategy support.

Both changes have already landed in the 2.5.0 release.

Did you see any issues when you tried out this feature?

- Sijie

On Tue, Apr 14, 2020 at 8:35 AM rocketraman@gmail.com <ro...@gmail.com>
wrote:

> Now that PIP-43 is released in 2.5.0, I wanted to follow up on the
> messages below.
>
> What remains to be done to support multiple different message types on
> one topic in Pulsar? Yi indicates below that PIP-43 sets the stage for
> this, but that the schema compatibility implementation would still need
> some work.
>
> Would this require another PIP, or just an issue to track the work?
>
> Regards,
> Raman

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by ro...@gmail.com, ro...@gmail.com.
Now that PIP-43 is released in 2.5.0, I wanted to follow up on the messages below.

What remains to be done to support multiple different message types on one topic in Pulsar? Yi indicates below that PIP-43 sets the stage for this, but that the schema compatibility implementation would still need some work.

Would this require another PIP, or just an issue to track the work?

Regards,
Raman

On 2019/09/16 01:32:39, Yi Tang <ss...@gmail.com> wrote: 
> Hi rarma,
> 
> I think it's a great and important feature. This PIP only requires the
> compatibility check from the underlying registry and doesn't touch its
> implementation details. I think we should address this feature in the
> future; this PIP provides the essential ability to implement it.
> 
> Thanks,
> Yi

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Yi Tang <ss...@gmail.com>.
Hi rarma,

I think it's a great and important feature. This PIP only requires the
compatibility check from the underlying registry and doesn't touch its
implementation details. I think we should address this feature in the
future; this PIP provides the essential ability to implement it.

Thanks,
Yi

rocketraman@gmail.com <ro...@gmail.com> wrote on Sun, Sep 15, 2019 at 22:36:

> I see a mention of compatibility in the PIP but with no details.  The docs
> about schema compatibility state this:
>
> > Consequently, those events need to go in the same Pulsar partition to
> maintain order. This application can use ALWAYS_COMPATIBLE to allow
> different kinds of events co-exist in the same topic.
>
> With this PIP, this limitation can be relaxed, and schema compatibility
> should be able to be strengthened, since each type of message on a topic
> can have its own schema, and compatibility can then be checked against only
> other schemas for the same type. Kafka does this via the concept of
> "subjects" in the schema registry, and subjects default to just the topic
> name (plus a "-key" or "-value" suffix since keys and values can both have
> their own schemas), but can also include (via an injectable strategy) the
> message type. Compatibility is managed at the subject level.
>
> Is this something that should be addressed in this PIP, or in future
> follow-on work? This is critical to supporting ordering across different
> message types, with schema compatibility verification by Pulsar.
>
> Regards,
> Raman

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by ro...@gmail.com, ro...@gmail.com.
I see a mention of compatibility in the PIP but with no details.  The docs about schema compatibility state this:

> Consequently, those events need to go in the same Pulsar partition to maintain order. This application can use ALWAYS_COMPATIBLE to allow different kinds of events co-exist in the same topic.

With this PIP, this limitation can be relaxed, and schema compatibility should be able to be strengthened, since each type of message on a topic can have its own schema, and compatibility can then be checked against only other schemas for the same type. Kafka does this via the concept of "subjects" in the schema registry, and subjects default to just the topic name (plus a "-key" or "-value" suffix since keys and values can both have their own schemas), but can also include (via an injectable strategy) the message type. Compatibility is managed at the subject level.
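
To make the subject naming concrete, here is a minimal sketch of three naming strategies. The function names are hypothetical; the resulting subject names follow what I understand to be the topic-name, record-name, and topic-record-name strategies of the Confluent Schema Registry, so treat the details as an assumption to verify against its documentation.

```python
def topic_name_strategy(topic, is_key=False):
    """Default: one subject per topic and per key/value position."""
    return f"{topic}-{'key' if is_key else 'value'}"


def record_name_strategy(record_name):
    """Subject depends only on the message type, regardless of topic."""
    return record_name


def topic_record_name_strategy(topic, record_name):
    """Subject depends on both the topic and the message type."""
    return f"{topic}-{record_name}"


default_subject = topic_name_strategy("customer-events")
key_subject = topic_name_strategy("customer-events", is_key=True)
typed_subject = topic_record_name_strategy(
    "customer-events", "com.example.CustomerCreated"
)
```

With either of the last two strategies, several message types on the same topic map to different subjects, so compatibility is checked per type rather than per topic.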

Is this something that should be addressed in this PIP, or in future follow-on work? This is critical to supporting ordering across different message types, with schema compatibility verification by Pulsar.

Regards,
Raman



On 2019/09/03 05:12:32, 唐谊 <ss...@gmail.com> wrote: 
> Hi all;
> 
> I am drafting a proposal to let a producer send messages with
> different schemas.
> 
> ## Motivation
> For now, a Pulsar producer can only produce messages of one schema,
> which is either specified by the user when the producer is created or,
> if the AUTO_PRODUCE_BYTES type is specified, fetched from the registry
> as the latest schema version. The schema, however, can be updated by an
> external system after the producer has started, which would lead to
> inconsistency between the message payload and the schema version
> metadata. Also, some scenarios, such as replicating from Kafka, require
> a single producer to replicate messages of different schemas from one
> Kafka partition to one Pulsar partition, in order to guarantee ordering
> and avoid duplicates.
> 
> I propose that each message can indicate its associated schema itself.
> For more detail, see
> https://gist.github.com/yittg/56c6dedf7509f634ec7effc4f6f3631d#file-pip-md
> 
> Looking forward to any feedback.
> 
> Thanks,
> Yi
> 
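
The proposal quoted above can be pictured with a toy model in which every message carries the version of its own schema. This is only an illustrative sketch, not the PIP's design or Pulsar's client API: `ToyTopic`, its fields, and the fingerprint-based version lookup are assumptions made for this example.

```python
import hashlib
import json


class ToyTopic:
    """Hypothetical single-partition topic with a schema table and an ordered log."""

    def __init__(self):
        self.schema_versions = {}  # schema fingerprint -> version number
        self.log = []              # ordered (schema_version, payload) entries

    def _version_for(self, schema):
        # Register the schema on first use; reuse its version afterwards.
        canonical = json.dumps(schema, sort_keys=True).encode()
        fingerprint = hashlib.sha256(canonical).hexdigest()
        if fingerprint not in self.schema_versions:
            self.schema_versions[fingerprint] = len(self.schema_versions)
        return self.schema_versions[fingerprint]

    def send(self, schema, payload):
        """Append one message, tagged with the version of its own schema."""
        version = self._version_for(schema)
        self.log.append((version, payload))
        return version


topic = ToyTopic()
user_schema = {"type": "record", "name": "User", "fields": ["id", "name"]}
order_schema = {"type": "record", "name": "Order", "fields": ["id", "total"]}

# One producer, one partition, interleaved schemas, order preserved.
topic.send(user_schema, {"id": 1, "name": "a"})
topic.send(order_schema, {"id": 7, "total": 9.5})
topic.send(user_schema, {"id": 2, "name": "b"})
```

Because the schema version travels with each message, a consumer can decode every entry of the log with the right schema even though the types are interleaved.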

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Dave Fisher <wa...@comcast.net>.
So true, Sijie! That said, these discussions are critical!

Regards,
Dave

Sent from my iPhone

> On Sep 6, 2019, at 1:12 AM, Sijie Guo <gu...@gmail.com> wrote:
> 
> We haven't defined a PIP process. However, the community has run PIPs in
> lazy consensus mode.
> 
> Lazy consensus means you don't have to insist that people discuss and/or
> approve your plan, and you certainly don't need to call a vote to get
> approval. You just assume you have the community's support unless someone
> says otherwise.
> 
> So go ahead and send the pull requests. If people have suggestions or
> objections, they will raise them in the discussion or in your pull requests.
> 
> Thanks,
> Sijie


Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Sijie Guo <gu...@gmail.com>.
We haven't defined a PIP process. However, the community has run PIPs in
lazy consensus mode.

Lazy consensus means you don't have to insist that people discuss and/or
approve your plan, and you certainly don't need to call a vote to get
approval. You just assume you have the community's support unless someone
says otherwise.

So go ahead and send the pull requests. If people have suggestions or
objections, they will raise them in the discussion or in your pull requests.

Thanks,
Sijie

On Thu, Sep 5, 2019 at 11:14 PM Yi Tang <ss...@gmail.com> wrote:

> Thanks, Sijie, Penghui. So what's next? I am not familiar with the whole
> process.

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Yi Tang <ss...@gmail.com>.
Thanks, Sijie, Penghui. So what's next? I am not familiar with the whole
process.

Sijie Guo <gu...@gmail.com> wrote on Fri, Sep 6, 2019 at 02:15:

> Hi Yi,
>
> The proposal looks pretty good. +1 from me.
>
> Looking forward to the implementation.
>
> - Sijie

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Sijie Guo <gu...@gmail.com>.
Hi Yi,

The proposal looks pretty good. +1 from me.

Looking forward to the implementation.

- Sijie

On Thu, Sep 5, 2019 at 7:37 AM Yi Tang <ss...@gmail.com> wrote:

> Hi, Pulsar folks, any more feedback?

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Yi Tang <ss...@gmail.com>.
Hi, Pulsar folks, any more feedback?

PengHui Li <pe...@apache.org> wrote on Tue, Sep 3, 2019 at 14:53:

> 👍 Looks good to me +1.

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by PengHui Li <pe...@apache.org>.
👍 Looks good to me +1.

Sijie Guo <gu...@gmail.com> wrote on Tue, Sep 3, 2019 at 2:12 PM:

> Thank you Yi.
>
> I have copied your gist to PIP-43:
>
> https://github.com/apache/pulsar/wiki/PIP-43%3A-producer-send-message-with-different-schema
>
> Thanks,
> Sijie

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by Sijie Guo <gu...@gmail.com>.
Thank you Yi.

I have copied your gist to PIP-43:
https://github.com/apache/pulsar/wiki/PIP-43%3A-producer-send-message-with-different-schema

Thanks,
Sijie

On Mon, Sep 2, 2019 at 10:35 PM 唐谊 <ss...@gmail.com> wrote:

> Here are some previous discussions,
> https://github.com/apache/pulsar/issues/4806
>

Re: [DISCUSS] PIP: Producer Send Message with Different Schema

Posted by 唐谊 <ss...@gmail.com>.
Here are some previous discussions,
https://github.com/apache/pulsar/issues/4806