Posted to user@avro.apache.org by Shannon Carey <sc...@expedia.com> on 2017/07/07 16:21:39 UTC

Re-using record & enum types in single schema file

I'm trying to see if I can use a single Avro schema file but still have multiple record & enum types re-used throughout different places in the schema.

Based on these answers https://stackoverflow.com/questions/21539113/can-i-split-an-apache-avro-schema-across-multiple-files and https://stackoverflow.com/questions/29280635/avro-schema-definition-nesting-types the only two ways are:

1. Use a union as the root type, and define all your re-used types as members of the union. Unfortunately, this approach is not ideal as it allows any of those types to be serialized as the root object regardless of which types you want to be the root.

Or, 2. Define a Protocol instead of a Schema, putting all your types into "types". This appears to have the same problem as #1.

Is that right? Is there any other way? I'd rather only have to reference one schema file instead of multiple.

Thanks,
Shannon

Re: Re-using record & enum types in single schema file

Posted by Dan Schmitt <da...@gmail.com>.
Try the protocol/IDL approach; the docs are at:

https://avro.apache.org/docs/1.8.2/idl.html

The IDL is much nicer about syntax, does the merging (it only
declares the enums where they are used, and doesn't drop them in a
global list), and will create schemas for each compound Avro type.

Sam's suggestion works too, but the organization is a hassle. The
IDL is a much nicer syntax.
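
To make the "schemas for each compound type" point concrete, here is a
rough Python sketch of the idea. It is not the Avro tooling itself, only
an illustration under assumptions: the protocol contents, the type names,
and the standalone_schema helper are all hypothetical, and namespaces and
aliases are ignored. Given a protocol's "types" list, it derives one
self-contained schema for a chosen type by inlining each referenced named
type at its first use and referring to it by name afterwards.

```python
import json

# Hypothetical protocol, shaped like the JSON form of an Avro protocol.
PROTOCOL_JSON = """
{
  "protocol": "Orders",
  "namespace": "example",
  "types": [
    {"type": "enum", "name": "Status", "symbols": ["OPEN", "CLOSED"]},
    {"type": "record", "name": "Address",
     "fields": [{"name": "city", "type": "string"}]},
    {"type": "record", "name": "Order", "fields": [
      {"name": "status", "type": "Status"},
      {"name": "billing", "type": "Address"},
      {"name": "shipping", "type": "Address"}
    ]}
  ]
}
"""


def standalone_schema(protocol, root_name):
    """Return a self-contained schema for root_name (illustrative helper)."""
    by_name = {t["name"]: t for t in protocol["types"]}
    emitted = set()  # named types already defined in the output

    def expand(schema):
        if isinstance(schema, str):
            # A name reference: inline the definition only the first time.
            if schema in by_name and schema not in emitted:
                return expand(by_name[schema])
            return schema
        if isinstance(schema, list):  # union
            return [expand(branch) for branch in schema]
        result = dict(schema)
        if "name" in result and result.get("type") in ("record", "enum", "fixed"):
            # Mark as defined before descending, so self-references stay names.
            emitted.add(result["name"])
        if "fields" in result:
            result["fields"] = [dict(f, type=expand(f["type"]))
                                for f in result["fields"]]
        for key in ("items", "values"):  # array and map element types
            if key in result:
                result[key] = expand(result[key])
        return result

    return expand(by_name[root_name])


if __name__ == "__main__":
    protocol = json.loads(PROTOCOL_JSON)
    print(json.dumps(standalone_schema(protocol, "Order"), indent=2))
```

In the resulting Order schema, Address is defined in full under "billing"
and referenced only by name under "shipping", and Order is the single
root type rather than one member of a union.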

On Fri, Jul 7, 2017 at 2:29 PM, Sam Groth <sg...@yahoo-inc.com> wrote:
> At least in Java, my understanding is that non-primitive type names are
> global to that schema file so you could use approach 1, but instead of
> defining all types at the root union, define them at the first subschema use
> and then refer to their type by name in the rest of the use cases.
> Disclaimer: It's been a long time since I tried this so I could be wrong.
>
>
> Hope this helps.
> Sam

Re: Re-using record & enum types in single schema file

Posted by Sam Groth <sg...@yahoo-inc.com>.
At least in Java, my understanding is that non-primitive type names are global to that schema file so you could use approach 1, but instead of defining all types at the root union, define them at the first subschema use and then refer to their type by name in the rest of the use cases. Disclaimer: It's been a long time since I tried this so I could be wrong.

Hope this helps.
Sam
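
A minimal sketch of the approach described above, assuming a single
schema file whose root is one record: each named type is defined in full
at its first use and referenced by name afterwards. The schema contents
and the check_schema helper are hypothetical. The helper is a hand-written
walk, not the Avro library's parser, and it ignores namespaces; it only
checks that every name reference appears after its definition in a
depth-first, left-to-right traversal.

```python
import json

# Hypothetical single-file schema: Order is the only root type.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Order",
  "namespace": "example",
  "fields": [
    {"name": "status",
     "type": {"type": "enum", "name": "Status", "symbols": ["OPEN", "CLOSED"]}},
    {"name": "billing",
     "type": {"type": "record", "name": "Address",
              "fields": [{"name": "city", "type": "string"}]}},
    {"name": "shipping", "type": "Address"},
    {"name": "previousStatus", "type": ["null", "Status"]}
  ]
}
"""

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double",
              "bytes", "string"}


def walk(schema, defined, unresolved):
    """Record named-type definitions and any references that precede them."""
    if isinstance(schema, str):
        if schema not in PRIMITIVES and schema not in defined:
            unresolved.append(schema)
    elif isinstance(schema, list):  # union: visit each branch in order
        for branch in schema:
            walk(branch, defined, unresolved)
    else:
        kind = schema["type"]
        if kind in ("record", "enum", "fixed"):
            defined.append(schema["name"])
            for field in schema.get("fields", []):
                walk(field["type"], defined, unresolved)
        elif kind == "array":
            walk(schema["items"], defined, unresolved)
        elif kind == "map":
            walk(schema["values"], defined, unresolved)
        else:
            # e.g. {"type": "string"}: the inner type is itself a schema
            walk(kind, defined, unresolved)


def check_schema(schema_json):
    """Return (names defined in order, names used before definition)."""
    defined, unresolved = [], []
    walk(json.loads(schema_json), defined, unresolved)
    return defined, unresolved


if __name__ == "__main__":
    print(check_schema(SCHEMA_JSON))
```

Because the root is the Order record rather than a union, only Order can
be serialized as the top-level object, while Status and Address are still
each defined once and reused by name.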
