Posted to dev@avro.apache.org by Tianyu Lang <ti...@squareup.com.INVALID> on 2020/03/25 22:58:42 UTC

Question regarding schema generated from Protobuf

Hello,

I am building a Java prototype in which the publisher converts Protobuf
objects into Avro records and sends them to a stream, and the consumer
deserializes them back into Protobuf objects. The Protobuf schemas on the
publisher and consumer are the "source of truth", and I use "Schema schema =
ProtobufData.get().getSchema(MyProtobufClass.class);" to get the
corresponding Avro schema for converting back and forth.

However, I ran into one problem: the Avro schema produced by this method
does not have a default value for repeated (array) Protobuf fields. This
causes compatibility issues for me: if I update the consumer schema first
by adding a new repeated Protobuf field, the generated Avro schema cannot
read old messages that lack this field, because it has no default value
to fill in for the new array field.

I can probably work around this by traversing the Avro Schema object and
adding a default value to every array field, but this feels a bit clumsy.
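
A minimal sketch of that workaround, assuming Avro 1.9 or later on the
classpath. The class name ArrayDefaults and the method
withEmptyArrayDefaults are made up for illustration and are not part of
Avro. It only handles top-level fields of a record schema and does not
descend into nested records:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.avro.Schema;

// Hypothetical helper, not part of Avro.
public class ArrayDefaults {

  /**
   * Returns a copy of the given record schema in which every top-level
   * array field without a default gets an empty-array default.
   * Record-level aliases and custom properties are not copied in this
   * sketch, and nested records are left untouched.
   */
  public static Schema withEmptyArrayDefaults(Schema record) {
    List<Schema.Field> fields = new ArrayList<>();
    for (Schema.Field f : record.getFields()) {
      boolean needsDefault =
          f.schema().getType() == Schema.Type.ARRAY && !f.hasDefaultValue();
      // A Field instance cannot be reused in another record, so every
      // field is copied, with or without a new default.
      Schema.Field copy = needsDefault
          ? new Schema.Field(f.name(), f.schema(), f.doc(), Collections.emptyList())
          : new Schema.Field(f, f.schema());
      fields.add(copy);
    }
    return Schema.createRecord(
        record.getName(), record.getDoc(), record.getNamespace(),
        record.isError(), fields);
  }
}
```

The consumer would then wrap the generated schema, for example
"ArrayDefaults.withEmptyArrayDefaults(ProtobufData.get().getSchema(MyProtobufClass.class))",
and use the result as its reader schema.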

Is there a better way to do this?
Also, what is the reason for array fields having no default values?
(Would forcing default values on array fields lead me into compatibility
edge cases?)

Thank you

Re: Question regarding schema generated from Protobuf

Posted by Tianyu Lang <ti...@squareup.com.INVALID>.
Sure thing. I will put up a patch.

On Thu, Mar 26, 2020 at 10:18 AM Doug Cutting <cu...@gmail.com> wrote:

> On Wed, Mar 25, 2020 at 3:58 PM Tianyu Lang <ti...@squareup.com.invalid>
> wrote:
>
> > Also, what's the reason behind having no default values for array field?
> >
>
> Perhaps there is no good reason.  For your use case it might make sense to
> automatically supply default values in the schemas generated for Protobuf
> objects.  Perhaps you can create a patch that adds this feature?
>
> Thanks,
>
> Doug
>

Re: Question regarding schema generated from Protobuf

Posted by Doug Cutting <cu...@gmail.com>.
On Wed, Mar 25, 2020 at 3:58 PM Tianyu Lang <ti...@squareup.com.invalid>
wrote:

> Also, what's the reason behind having no default values for array field?
>

Perhaps there is no good reason.  For your use case it might make sense to
automatically supply default values in the schemas generated for Protobuf
objects.  Perhaps you can create a patch that adds this feature?

Thanks,

Doug

Re: Question regarding schema generated from Protobuf

Posted by Tianyu Lang <ti...@squareup.com.INVALID>.
Hi Andy,

https://issues.apache.org/jira/browse/AVRO-2780 is a separate but more
critical issue, while the question I am asking in this thread is more of a
convenience.

Thank you

On Wed, Mar 25, 2020 at 5:44 PM Andy Le <an...@gmail.com> wrote:

> Is this related to your issue
> https://issues.apache.org/jira/browse/AVRO-2780 ?

Re: Question regarding schema generated from Protobuf

Posted by Andy Le <an...@gmail.com>.
Is this related to your issue https://issues.apache.org/jira/browse/AVRO-2780 ?
