Posted to dev@pulsar.apache.org by Enrico Olivelli <eo...@gmail.com> on 2021/05/04 06:37:12 UTC

PIP-85 Add Schema Information to Message in Java Client API

Hello Pulsar community,
I would like to start a discussion about adding a new Message.getSchema() API.

This is a Google Doc with the contents of the PIP
https://docs.google.com/document/d/1VWi5LHP44V31nP4bCui9d5RXwH6xc_phrUes6tvNguk

This feature is particularly needed on the Consumer side, when you are
using Schema.AUTO_CONSUME().

When you use Schema.AUTO_CONSUME(), Pulsar downloads the Schema from
the Schema Registry.
The message is a Message<GenericRecord>, but we recently introduced
GenericObject, so it now works with every Schema: BYTES, primitives,
KeyValue and structures (AVRO, JSON, Protobuf).
Just use GenericObject.getNativeObject() to access the decoded Java object.
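
As a rough illustration of what "access the decoded Java object" can look like
on the consumer side, here is a minimal sketch. It does not use the Pulsar
client: the GenericObject interface below is a hypothetical stand-in that
models only the getNativeObject() accessor named above, and the describe()
helper and its output format are assumptions made up for this example.

```java
import java.nio.charset.StandardCharsets;

final class NativeObjectSketch {

    // Hypothetical stand-in for Pulsar's GenericObject (NOT the real class);
    // only the accessor mentioned in this thread is modelled.
    interface GenericObject {
        Object getNativeObject();
    }

    // Inspect the runtime type of the decoded object, since with an
    // auto-consuming reader the payload type is not known at compile time.
    static String describe(GenericObject value) {
        Object nativeObject = value.getNativeObject();
        if (nativeObject instanceof byte[]) {
            return "bytes[" + ((byte[]) nativeObject).length + "]";
        }
        if (nativeObject instanceof String) {
            return "string:" + nativeObject;
        }
        if (nativeObject instanceof Number) {
            return "number:" + nativeObject;
        }
        // KeyValue pairs and structured records would end up here in this sketch.
        return "other:" + nativeObject.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        GenericObject bytes = () -> "abc".getBytes(StandardCharsets.UTF_8);
        GenericObject text = () -> "hello";
        GenericObject number = () -> 42;

        System.out.println(describe(bytes));
        System.out.println(describe(text));
        System.out.println(describe(number));
    }
}
```

The sketch shows why the runtime type alone is a weak signal: it tells you the
Java class of the payload, but not which schema produced it.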

Currently there is no way to get the actual schema of each message,
even though each message can have a Schema different from that of the
other messages in the same topic.

Main requirements for the Schema instance returned by Message.getSchema():
- it must represent the actual schema for the message
- it must return an accurate SchemaInfo for the message
- it must return a Native Schema (like a native AVRO Schema) for the message
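
The requirements above could be pictured roughly as follows. This is only a
sketch under stated assumptions, not the implementation in the PR: Message,
Schema and SchemaInfo here are simplified, hypothetical stand-ins for the
Pulsar types, native schemas are represented by plain strings, and
schemaOf() and summarize() are helpers invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class MessageSchemaSketch {

    // Hypothetical, simplified stand-ins; these are NOT the Pulsar client classes.
    static final class SchemaInfo {
        final String name;
        final String type;

        SchemaInfo(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    interface Schema {
        SchemaInfo getSchemaInfo();          // accurate SchemaInfo for the message
        Optional<Object> getNativeSchema();  // e.g. a native AVRO schema, if any
    }

    static final class Message {
        private final Object value;
        private final Schema schema;

        Message(Object value, Schema schema) {
            this.value = value;
            this.schema = schema;
        }

        Object getValue() {
            return value;
        }

        // The accessor proposed in this thread: the schema of this
        // particular message, not one schema for the whole topic.
        Schema getSchema() {
            return schema;
        }
    }

    // Helper (assumption): build a Schema from a name, a type and an
    // optional native schema object.
    static Schema schemaOf(final String name, final String type, final Object nativeSchema) {
        final SchemaInfo info = new SchemaInfo(name, type);
        return new Schema() {
            @Override
            public SchemaInfo getSchemaInfo() {
                return info;
            }

            @Override
            public Optional<Object> getNativeSchema() {
                return Optional.ofNullable(nativeSchema);
            }
        };
    }

    // Read the schema from each message individually.
    static List<String> summarize(List<Message> messages) {
        List<String> lines = new ArrayList<>();
        for (Message message : messages) {
            Schema schema = message.getSchema();
            String nativeSchema = schema.getNativeSchema().map(Object::toString).orElse("none");
            lines.add(schema.getSchemaInfo().name + " [" + schema.getSchemaInfo().type
                    + "] native=" + nativeSchema);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Messages from the same topic, written with different schemas.
        List<Message> messages = Arrays.asList(
                new Message("alice", schemaOf("user-v1", "AVRO", "avro-v1-definition")),
                new Message("alice,30", schemaOf("user-v2", "AVRO", "avro-v2-definition")),
                new Message("raw", schemaOf("bytes", "BYTES", null)));

        for (String line : summarize(messages)) {
            System.out.println(line);
        }
    }
}
```

In this sketch the native schema is optional, because a schema such as BYTES
has no native counterpart to return.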

This is the PR with the implementation
https://github.com/apache/pulsar/pull/10476

Best regards
Enrico Olivelli

Re: PIP-85 Add Schema Information to Message in Java Client API

Posted by Enrico Olivelli <eo...@gmail.com>.
Hi,
The discussion is moving forward on the gdoc.
Thanks to Penghui, who is giving lots of useful feedback, and also to
Vincent, who is testing the new API with real-world advanced use cases.

It would be good to see more participants, as this is an important API
addition.

Enrico
