Posted to user@avro.apache.org by Aaron Dixon <at...@gmail.com> on 2019/06/15 02:01:38 UTC

Aliases with Forward Compatibility

I asked this question on the dev list, but didn't get a response here. (My
original question to the dev list:
https://sematext.com/opensee/m/F2svI1cI2oW1CwdmF1?subj=readers+using+writer+s+aliases+
)

It also seems this question was asked before in late 2018, but dead-ended
at
https://sematext.com/opensee/m/Avro/F2svI1obxDi4WGqf1?subj=Re+Alias+with+Backward+Compatibility

Avro aliases are typically used by *reader* schemas to rename fields.
(E.g., a reader can expect a "first-name" string and use the alias "firts-name"
to deal with old writers that had it misspelled in the original writer
schema.) This is backward compatibility (new readers can read old writers).
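
As a rough illustration of the reader-side aliasing described above, here is a minimal Python sketch that uses plain dicts in place of parsed Avro schemas. The function name `resolve_with_reader_aliases` and the `Person` record are assumptions made for illustration, not the Avro library's API. The field names are kept as written in this example, although a real Avro schema would not accept hyphens in names.

```python
# Hypothetical sketch: plain dicts stand in for parsed Avro record schemas.
old_writer_schema = {
    "type": "record",
    "name": "Person",
    "fields": [{"name": "firts-name", "type": "string"}],
}
new_reader_schema = {
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "first-name", "type": "string", "aliases": ["firts-name"]}
    ],
}


def resolve_with_reader_aliases(writer_schema, reader_schema, datum):
    """Return datum keyed by reader field names.

    A reader field matches a writer field by its own name first,
    then by any alias declared on the reader field.
    """
    writer_names = {f["name"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        for candidate in [field["name"]] + field.get("aliases", []):
            if candidate in writer_names:
                result[field["name"]] = datum[candidate]
                break
    return result


resolved = resolve_with_reader_aliases(
    old_writer_schema, new_reader_schema, {"firts-name": "Ada"}
)
```

Here the new reader sees the old writer's "firts-name" value under its own "first-name" field.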

However, we would like to avoid updating reader code to deal with new
writers (i.e., we want *forward* compatibility with aliases). It seems that
this should be easy: (old) readers could look at the new writer-defined
aliases and leverage them for forward compatibility while doing schema
resolution.

Concrete example: my old schema expects "firts-name"; my new schema fixes
this by introducing "first-name" with the "firts-name" as an alias. Instead
of being obligated to update my old reader(s), couldn't the schema
resolution logic notice this alias and *invert* the aliasing as it reads
the data to give the old reader the field it expects?
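
The inverted lookup proposed here could be sketched roughly as follows, again with plain Python dicts standing in for parsed schemas. The function name `resolve_with_writer_aliases` is hypothetical, and this shows the behaviour being asked for, not what the Avro Java implementation does.

```python
# Hypothetical sketch of the *proposed* behaviour: the old reader is unchanged,
# and resolution consults the aliases declared by the new writer.
new_writer_schema = {
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "first-name", "type": "string", "aliases": ["firts-name"]}
    ],
}
old_reader_schema = {
    "type": "record",
    "name": "Person",
    "fields": [{"name": "firts-name", "type": "string"}],
}


def resolve_with_writer_aliases(writer_schema, reader_schema, datum):
    """Return datum keyed by reader field names, inverting writer aliases."""
    # Map every name a writer field is known by back to its current name.
    # Current names always win over aliases if the two ever collide.
    known_as = {}
    for field in writer_schema["fields"]:
        known_as[field["name"]] = field["name"]
        for alias in field.get("aliases", []):
            known_as.setdefault(alias, field["name"])

    result = {}
    for field in reader_schema["fields"]:
        writer_name = known_as.get(field["name"])
        if writer_name is not None:
            result[field["name"]] = datum[writer_name]
    return result


resolved = resolve_with_writer_aliases(
    new_writer_schema, old_reader_schema, {"first-name": "Ada"}
)
```

With this lookup the old reader still receives a "firts-name" field even though the new writer only emits "first-name".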

Is there a fundamental reason that this isn't part of the Avro Java
implementation or the spec? Not having to coordinate updates to readers during
schema evolution (field renames) would be a huge win, IMO.

Re: Aliases with Forward Compatibility

Posted by Zoltan Farkas <zo...@yahoo.com>.
Fork is here: https://github.com/zolyfarkas/avro
When I have time, I try to open PRs against the official repo to make sure things do not drift too far apart.
At minimum, I file a JIRA so that when I have time I can work on a PR. I appreciate any help with PRs against official.

Currently, field name aliases should work the same as in official. Enum symbol aliases are something that exists only in my fork.

Let me know if you have any questions.

—Z

> On Jun 15, 2019, at 3:18 PM, Aaron Dixon <at...@gmail.com> wrote:
> 
> Thank you, Zoltan. Is your fork publicly available? Could I take a look at it?
> 
> On Sat, Jun 15, 2019 at 5:30 AM Zoltan Farkas <zo...@yahoo.com> wrote:
> I agree with your understanding of how aliases should work, and a lot of developers I interact with expect that aliases should work this way.
> When I implemented https://issues.apache.org/jira/browse/AVRO-1752 in my Avro fork, I implemented the resolution the way you describe it.
> 
> I see no reason why this could not be implemented as part of 2.0… but I would let others with more authority chime in.
> 
> —Z

Re: Aliases with Forward Compatibility

Posted by Aaron Dixon <at...@gmail.com>.
Thank you, Zoltan. Is your fork publicly available? Could I take a look at
it?

On Sat, Jun 15, 2019 at 5:30 AM Zoltan Farkas <zo...@yahoo.com> wrote:

> I agree with your understanding of how aliases should work, and a lot of
> developers I interact with expect that aliases should work this way.
> When I implemented https://issues.apache.org/jira/browse/AVRO-1752 in my
> Avro fork, I implemented the resolution the way you describe it.
>
> I see no reason why this could not be implemented as part of 2.0… but I
> would let others with more authority chime in.
>
> —Z

Re: Aliases with Forward Compatibility

Posted by Zoltan Farkas <zo...@yahoo.com>.
I agree with your understanding of how aliases should work, and a lot of developers I interact with expect that aliases should work this way.
When I implemented https://issues.apache.org/jira/browse/AVRO-1752 in my Avro fork, I implemented the resolution the way you describe it.

I see no reason why this could not be implemented as part of 2.0… but I would let others with more authority chime in.

—Z


