Posted to user@avro.apache.org by Yang <te...@gmail.com> on 2011/06/01 01:33:03 UTC

"include" ability in schema file

I have a protocol schema, hello.avpr:

{ ....
  "types": [
    { "name": "input_msg", "type": "record", "fields": [
        { "name": "date", "type": "int" },
        { "name": "msg", "type": "string" }
      ]
    }
  ]
}


now if I need to serialize the record "input_msg", I'm going to need its
schema,
so I'd need to pass a Schema object to DataFileWriter. but how can I parse that
schema out of the protocol file?
the schema for "input_msg" is not in a standalone file.

I would like a way to write the type of "input_msg" in a standalone file,
and then reference/include that from my hello.avpr.
is this possible?


also btw, for an Avro-generated object class, would it be better to make the
getSchema() method static?
that way I can get its schema without creating an actual object



maybe this is not a big issue, we can simply create some hack scripts to
paste together individual files, i.e. creating our own
"include" macro, but that's kind of kludgy

thanks
Yang

Re: "include" ability in schema file

Posted by Yang <te...@gmail.com>.
got it, thanks Scott

On Wed, Jun 1, 2011 at 10:08 AM, Scott Carey <sc...@richrelevance.com> wrote:

> Yes, I have had to do some automated copy-paste myself.
>
> You could also put Blah and Bar into idl form.
>
> The Avro project would welcome patches to the compilers to fix these sorts
> of issues.  Most of the Java APIs handle these cases, but the code that
> reads files and converts them to Schema objects is fairly simple.

Re: "include" ability in schema file

Posted by Scott Carey <sc...@richrelevance.com>.
Yes, I have had to do some automated copy-paste myself.

You could also put Blah and Bar into idl form.

The Avro project would welcome patches to the compilers to fix these sorts of issues.  Most of the Java APIs handle these cases, but the code that reads files and converts them to Schema objects is fairly simple.

On 6/1/11 9:56 AM, "Yang" <te...@gmail.com> wrote:

actually the link that Doug gave shows that idl can indeed import avpr/avsc


back to the record reference problem:  yes, I can reference Foo from Bar, but since I put Bar and, let's say, Blah outside of my idl,
they are independent .avsc files now. if Bar and Blah both reference Foo, where should I put Foo? I can't reference another .avsc from Bar.avsc or Blah.avsc,
so I have to copy-paste Foo into Blah.avsc and Bar.avsc

the copy-paste can be trivially automated with some perl scripts, it's just a bit hacky

Thanks
Yang

Re: "include" ability in schema file

Posted by Yang <te...@gmail.com>.
actually the link that Doug gave shows that idl can indeed import avpr/avsc


back to the record reference problem:  yes, I can reference Foo from Bar, but
since I put Bar and, let's say, Blah outside of my idl,
they are independent .avsc files now. if Bar and Blah both reference Foo, where
should I put Foo? I can't reference another .avsc from Bar.avsc or
Blah.avsc,
so I have to copy-paste Foo into Blah.avsc and Bar.avsc

the copy-paste can be trivially automated with some perl scripts, it's just
a bit hacky

Thanks
Yang

On Wed, Jun 1, 2011 at 9:48 AM, Scott Carey <sc...@richrelevance.com> wrote:

> In an *.avsc you can have multiple records in a JSON array:
> [
>  {"type":"record", "name":"Foo", … } ,
>  {"type":"record", "name":"Bar", … }
> ]
> And in the above case, Bar can reference Foo.
>
>
> idl can import other idl's but not an avsc or avpr to my knowledge.

Re: "include" ability in schema file

Posted by Scott Carey <sc...@richrelevance.com>.
In an *.avsc you can have multiple records in a JSON array:
[
 {"type":"record", "name":"Foo", … } ,
 {"type":"record", "name":"Bar", … }
]
And in the above case, Bar can reference Foo.
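
A minimal sketch of the automated copy-paste discussed in this thread, assuming plain Java with no Avro dependency: it wraps the contents of standalone .avsc files in a single JSON array, in the order given, so that a dependency such as Foo is listed before Bar. The class name SchemaBundler and its methods are hypothetical and not part of Avro, and whether a given Avro tool accepts the resulting array is not checked here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Avro): joins .avsc texts into one JSON array. */
public final class SchemaBundler {

    private SchemaBundler() {}

    /**
     * Wraps already-loaded schema JSON texts in a JSON array, preserving order.
     * Callers list dependencies first (e.g. Foo before Bar) so that later
     * entries can refer to earlier ones by name.
     */
    public static String bundle(List<String> schemaJsonTexts) {
        List<String> trimmed = new ArrayList<>();
        for (String text : schemaJsonTexts) {
            trimmed.add(text.trim());
        }
        return "[\n" + String.join(",\n", trimmed) + "\n]\n";
    }

    /** Reads each file as UTF-8 and bundles the contents in the given order. */
    public static String bundleFiles(List<Path> avscFiles) throws IOException {
        List<String> texts = new ArrayList<>();
        for (Path file : avscFiles) {
            texts.add(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        }
        return bundle(texts);
    }

    /** Prints the bundled array for the files named on the command line. */
    public static void main(String[] args) throws IOException {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        System.out.print(bundleFiles(files));
    }
}
```

It could be run as "java SchemaBundler Foo.avsc Bar.avsc > FooBar.avsc". The sketch only concatenates text; it does not validate the JSON or detect a type that ends up defined twice.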


idl can import other idl's but not an avsc or avpr to my knowledge.



On 6/1/11 8:57 AM, "Yang" <te...@gmail.com> wrote:

Thanks Doug.

this .idl form provides a great tool to better organize my schemas.

but a slightly more complicated use case of mine still has problems:

let's say my hello.idl references record type "Orange", so I define the "Orange" record type in
Orange.avsc,
but Orange contains a Fruit record, so I need to define a Fruit.avsc and reference it from Orange,
but .avsc does not have the import ability


Thanks
Yang

Re: "include" ability in schema file

Posted by Yang <te...@gmail.com>.
Thanks Doug.

this .idl form provides a great tool to better organize my schemas.

but a slightly more complicated use case of mine still has problems:

let's say my hello.idl references record type "Orange", so I define the "Orange"
record type in
Orange.avsc,
but Orange contains a Fruit record, so I need to define a Fruit.avsc and
reference it from Orange,
but .avsc does not have the import ability


Thanks
Yang




On Wed, Jun 1, 2011 at 1:06 AM, Doug Cutting <cu...@apache.org> wrote:

> Avro IDL supports imports:
>
> http://avro.apache.org/docs/current/idl.html#imports
>
> One can also use a protocol with no messages to define multiple,
> dependent types.
>
> To get a type from a protocol file in Java one can use something like:
>
>  Schema s = Protocol.parse(new File("MyProtocol.avpr")).getType("Foo");
>
> Doug

Re: "include" ability in schema file

Posted by Doug Cutting <cu...@apache.org>.
Avro IDL supports imports:

http://avro.apache.org/docs/current/idl.html#imports

One can also use a protocol with no messages to define multiple,
dependent types.

To get a type from a protocol file in Java one can use something like:

  Schema s = Protocol.parse(new File("MyProtocol.avpr")).getType("Foo");

Doug
