Posted to user@avro.apache.org by Tianxiang Xiong <ti...@fundingcircle.com> on 2016/11/29 00:09:23 UTC

Nesting records in Avro

I posted a question on StackOverflow
<http://stackoverflow.com/questions/40854529/nesting-avro-schemas> about
nesting records in Avro. I think it'll be helpful to make it available here
on the mailing list as well, since the Avro community is not very active on
SO:

According to this question
<https://stackoverflow.com/questions/11764287/how-to-nest-records-in-an-avro-schema>
on nesting Avro schemas, the right way to nest a record schema is as follows:

{
    "name": "person",
    "type": "record",
    "fields": [
        {"name": "firstname", "type": "string"},
        {"name": "lastname", "type": "string"},
        {
            "name": "address",
            "type": {
                "type": "record",
                "name": "AddressUSRecord",
                "fields": [
                    {"name": "streetaddress", "type": "string"},
                    {"name": "city", "type": "string"}
                ]
            }
        }
    ]
}

I don't like giving the field the name address and having to give a
different name (AddressUSRecord) to the field's schema. Can I give the
field and schema the same name, address?

What if I want to use the AddressUSRecord schema in multiple other schemas,
not just person? If I want to use AddressUSRecord in another schema, let's
say business, do I have to name it something else?

Ideally, I'd like to define AddressUSRecord in a separate schema, then let
the type of address reference AddressUSRecord. However, it's not clear that
Avro 1.8.1 supports this out-of-the-box. This 2014 article
<http://www.treselle.com/blog/advanced-avro-schema-design-reuse/> shows
that sub-schemas need to be handled with custom code. What is the best way
to define reusable schemas in Avro 1.8.1?
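
One way to get reuse without relying on cross-file references is to keep the
shared record definition in one place in code and inline it into each schema
before it is parsed or registered. The Python sketch below uses only the
standard library. The business schema, its "name" field, and the helper
with_address are hypothetical and do not come from this thread; this is an
illustration of the idea, not a feature of Avro or of the Schema Registry.

```python
import copy
import json

# Shared record definition, written once (same fields as AddressUSRecord above).
ADDRESS_US_RECORD = {
    "type": "record",
    "name": "AddressUSRecord",
    "fields": [
        {"name": "streetaddress", "type": "string"},
        {"name": "city", "type": "string"},
    ],
}


def with_address(name, fields):
    """Build a record schema whose last field, "address", embeds the shared definition."""
    # Deep-copy so each schema owns its own copy of the definition.
    address_field = {"name": "address", "type": copy.deepcopy(ADDRESS_US_RECORD)}
    return {"name": name, "type": "record", "fields": fields + [address_field]}


person = with_address("person", [
    {"name": "firstname", "type": "string"},
    {"name": "lastname", "type": "string"},
])

# Hypothetical second schema that reuses the same address definition.
business = with_address("business", [{"name": "name", "type": "string"}])

# Each result serializes to a self-contained JSON document.
person_json = json.dumps(person, indent=4)
business_json = json.dumps(business, indent=4)
```

Each generated schema stands alone, so nothing downstream has to resolve a
reference. The cost is that the definition is repeated in every schema that
uses it.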

*Note*: I'd like a solution that works with Confluent Inc.'s Schema
Registry. There's a Google Groups thread
<https://groups.google.com/forum/#!topic/confluent-platform/pey5-AS1NSE> that
seems to suggest that Schema Registry does not play nice with schema
references.
-- 

*Tianxiang Xiong *| Clojure Engineer

*tianxiang.xiong@fundingcircle.com <ti...@fundingcircle.com>*

747 Front Street, Floor 4 | San Francisco, CA 94111

Re: Nesting records in Avro

Posted by Tianxiang Xiong <ti...@fundingcircle.com>.
Thanks, but I don't think that would work with the Confluent Schema
Registry. This answer from Niel Drummond
<http://stackoverflow.com/a/40865366/864684> on StackOverflow should be
correct.

On 29 November 2016 at 11:24, suraj acharya <su...@gmail.com> wrote:

> You can try to create a schema with information somewhat like this
>
> {
>     "name": "person",
>     "type": "record",
>     "fields": [
>         {"name": "firstname", "type": "string"},
>         {"name": "lastname", "type": "string"},
>         {"name": "address", "type": ["AddressUSRecord", "null"]}
>     ]
> }
>
> {
>     "name": "AddressUSRecord",
>     "type": "record",
>     "fields": [
>         {"name": "streetaddress", "type": "string"},
>         {"name": "city", "type": "string"}
>     ]
> }
>
> This way, the AddressUSRecord can be used from some other place, say
> billing.
> Hope that helps.
>
> -Suraj Acharya



Re: Nesting records in Avro

Posted by suraj acharya <su...@gmail.com>.
You can try to create a schema with information somewhat like this

{
    "name": "person",
    "type": "record",
    "fields": [
        {"name": "firstname", "type": "string"},
        {"name": "lastname", "type": "string"},
        {"name": "address", "type": ["AddressUSRecord", "null"]}
    ]
}

{
    "name": "AddressUSRecord",
    "type": "record",
    "fields": [
        {"name": "streetaddress", "type": "string"},
        {"name": "city", "type": "string"}
    ]
}

This way, the AddressUSRecord can be used from some other place, say
billing.
Hope that helps.

-Suraj Acharya
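
If a single self-contained schema is needed, a by-name reference like the one
in the person schema above can be expanded into the full definition before the
schema is used. The following is a minimal Python sketch, assuming schemas are
handled as parsed JSON and that names are matched literally (no namespace
resolution). The names inline_named_types and named_types are hypothetical and
are not part of any Avro or Confluent API.

```python
import copy


def inline_named_types(schema, named_types, seen=None):
    """Return a copy of `schema` with name references replaced by definitions.

    Only the first reference to each name is expanded. Later references stay
    as plain names, since Avro does not allow a name to be defined twice
    within one schema.
    """
    if seen is None:
        seen = set()
    if isinstance(schema, str):
        if schema in named_types and schema not in seen:
            seen.add(schema)
            definition = copy.deepcopy(named_types[schema])
            return inline_named_types(definition, named_types, seen)
        return schema  # primitive type or an already-defined name
    if isinstance(schema, list):  # union: resolve each branch in order
        return [inline_named_types(branch, named_types, seen) for branch in schema]
    if isinstance(schema, dict):
        out = dict(schema)
        kind = out.get("type")
        if kind == "record":
            seen.add(out.get("name"))  # a record may refer to itself by name
            out["fields"] = [
                {**field, "type": inline_named_types(field["type"], named_types, seen)}
                for field in out["fields"]
            ]
        elif kind == "array":
            out["items"] = inline_named_types(out["items"], named_types, seen)
        elif kind == "map":
            out["values"] = inline_named_types(out["values"], named_types, seen)
        return out
    return schema


named_types = {
    "AddressUSRecord": {
        "name": "AddressUSRecord",
        "type": "record",
        "fields": [
            {"name": "streetaddress", "type": "string"},
            {"name": "city", "type": "string"},
        ],
    }
}

person = {
    "name": "person",
    "type": "record",
    "fields": [
        {"name": "firstname", "type": "string"},
        {"name": "lastname", "type": "string"},
        {"name": "address", "type": ["AddressUSRecord", "null"]},
    ],
}

resolved = inline_named_types(person, named_types)
```

The input schema is left unchanged, so the same named_types mapping can be
reused to resolve other schemas, for example one for billing.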
