Posted to user@avro.apache.org by Christopher Armstrong <ch...@idsoftware.com> on 2011/02/11 02:25:39 UTC

record containing array of records in python-avro

Hi guys. I'm having trouble declaring a record with an array of records 
in Python.

Declaring an array of records works just fine:

>>>  print parse('{"name": "stuffs", "type": "array", "items": {"name": "whatever", "type": "record", "fields": [{"name": "a", "type": "string"}]}}')
{"items": {"fields": [{"type": "string", "name": "a"}], "type": "record", "name": "whatever"}, "type": "array"}


But putting that as a field in a record gives me the following:

>>>  print parse('{"name": "outerrecord", "type": "record", "fields": [{"name": "stuffs", "type": "array", "items": {"name": "whatever", "type": "record", "fields": [{"name": "a", "type": "string"}]}}]}')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 708, in parse
     return make_avsc_object(json_data, names)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 664, in make_avsc_object
     return RecordSchema(name, namespace, fields, names, type)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 600, in __init__
     field_objects = RecordSchema.make_field_objects(fields, names)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 568, in make_field_objects
     new_field = Field(type, name, has_default, default, order, names)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 310, in __init__
     raise SchemaParseException(fail_msg)
avro.schema.SchemaParseException: Type property "array" not a valid Avro schema: Could not make an Avro Schema object from array.


Any help? Is this just a bug in the python implementation? Is my syntax 
correct?

-- 
Christopher Armstrong, id Software


Re: record containing array of records in python-avro

Posted by Christopher Armstrong <ch...@idsoftware.com>.
On 02/11/2011 12:08 PM, Scott Carey wrote:
> Yes, that is a common mistake.  Try:
>
> {"type":"record","name":"outer","fields":[{"name":"a","type":{"type":"array","items":"string"}}]}
>
>
>
> The problem is that "a" is a field, and the field's name is "a".  Arrays
> don't have names, they are a nameless type.  Therefore, without the
> nesting the "items" part applies to the field, not the array, but it is an
> array property.
>
> Another way to think about it is that "string" is shorthand for
> {"type":"string"}.  Likewise, if you declare and name a record, you can
> simply reference its name after "type": afterwards.  "string" is a
> built-in name.
>
> If there is a bug here, it is that the error message is not that clear to
> someone new with Avro.
>
> In Java, this error now has the message:
>
> org.apache.avro.SchemaParseException: "array" is not a defined name. The
> type of the "a" field must be a defined name or a {"type": ...} expression.
>

Thanks very much, Scott. I see that this is a common problem, as someone 
else just posted with the exact same issue :) I also got help from Doug 
Cutting on IRC about this; thanks to both of you.


> -Scott
>


-- 
Christopher Armstrong, id Software


Re: record containing array of records in python-avro

Posted by Scott Carey <sc...@richrelevance.com>.
Yes, that is a common mistake.  Try:

{"type":"record","name":"outer","fields":[{"name":"a","type":{"type":"array","items":"string"}}]}



The problem is that "a" is a field, and the field's name is "a".  Arrays
don't have names, they are a nameless type.  Therefore, without the
nesting the "items" part applies to the field, not the array, but it is an
array property.
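
Applied to the schema from the original post, the same nesting might look like the sketch below. It uses only Python's standard json module to build the schema string; the variable names are made up for illustration, and the resulting string is what one would then pass to the library's parse function (not called here).

```python
import json

# The array-of-records type from the original post. The "name" key is dropped
# from the array itself, since arrays are nameless; the record inside keeps its name.
stuffs_type = {
    "type": "array",
    "items": {
        "type": "record",
        "name": "whatever",
        "fields": [{"name": "a", "type": "string"}],
    },
}

# The field carries only its own properties ("name", "type").
# The whole array schema, including "items", is nested under "type".
outer_schema = {
    "type": "record",
    "name": "outerrecord",
    "fields": [{"name": "stuffs", "type": stuffs_type}],
}

schema_json = json.dumps(outer_schema)
print(schema_json)
```

Here "items" sits inside the nested {"type": "array", ...} object, so it is read as a property of the array rather than of the field.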

Another way to think about it is that "string" is shorthand for
{"type":"string"}.  Likewise, if you declare and name a record, you can
simply reference its name after "type": afterwards.  "string" is a
built-in name.
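
A small sketch of both points, using only the standard library. The helper expand_shorthand and the field names "first" and "rest" are hypothetical, introduced just for this illustration; they are not part of the avro package.

```python
# Primitive type names defined by Avro.
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def expand_shorthand(type_name):
    """Hypothetical helper: spell out a primitive name in its long form.

    Any other name (for example a previously declared record) is returned
    unchanged, because it refers to a named type rather than a built-in.
    """
    if type_name in PRIMITIVES:
        return {"type": type_name}
    return type_name


# The record "whatever" is declared once, in the first field, and then
# referenced by name as the item type of the array in the second field.
schema = {
    "type": "record",
    "name": "outer",
    "fields": [
        {
            "name": "first",
            "type": {
                "type": "record",
                "name": "whatever",
                "fields": [{"name": "a", "type": "string"}],
            },
        },
        {"name": "rest", "type": {"type": "array", "items": "whatever"}},
    ],
}

print(expand_shorthand("string"))
print(expand_shorthand("whatever"))
```

Note that the declaration has to come before the reference, which is why "first" is listed ahead of "rest" in this sketch.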

If there is a bug here, it is that the error message is not that clear to
someone new with Avro.

In Java, this error now has the message:

org.apache.avro.SchemaParseException: "array" is not a defined name. The
type of the "a" field must be a defined name or a {"type": ...} expression.
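
Until the Python message is equally clear, a rough pre-check along these lines could flag the mistake before parsing. This is only a heuristic sketch: find_flattened_fields is a hypothetical helper, not something the avro package provides, and it assumes that a bare "array", "map", "record", "enum" or "fixed" as a field type is a flattened declaration rather than a reference to a named type. It is written for Python 3, while the sessions in this thread are Python 2.

```python
import json

# Complex type names that need a nested {"type": ...} object to carry
# their extra properties ("items", "values", "fields", "symbols", "size").
COMPLEX_TYPE_NAMES = {"record", "enum", "array", "map", "fixed"}


def find_flattened_fields(schema):
    """Return the names of record fields whose "type" is a bare complex type name."""
    suspects = []
    if isinstance(schema, list):
        # A union: check each branch.
        for branch in schema:
            suspects.extend(find_flattened_fields(branch))
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            for field in schema.get("fields", []):
                field_type = field.get("type")
                if isinstance(field_type, str) and field_type in COMPLEX_TYPE_NAMES:
                    suspects.append(field["name"])
                else:
                    suspects.extend(find_flattened_fields(field_type))
        elif kind == "array":
            suspects.extend(find_flattened_fields(schema.get("items")))
        elif kind == "map":
            suspects.extend(find_flattened_fields(schema.get("values")))
    return suspects


# The failing schema from this thread, and the corrected form.
bad = json.loads('{"name": "outer", "type": "record", "fields": '
                 '[{"name": "a", "type": "array", "items": "string"}]}')
good = json.loads('{"type": "record", "name": "outer", "fields": '
                  '[{"name": "a", "type": {"type": "array", "items": "string"}}]}')

print(find_flattened_fields(bad))
print(find_flattened_fields(good))
```

For the failing schema the check reports the field "a"; for the corrected one it reports nothing.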

-Scott

On 2/11/11 8:59 AM, "Christopher Armstrong"
<ch...@idsoftware.com> wrote:

>On 02/10/2011 07:25 PM, Christopher Armstrong wrote:
>> Hi guys. I'm having trouble declaring a record with an array of
>> records in Python.
>
>Actually, this is a red herring. The problem is much simpler: I can't
>have a record that contains an array as a field. Is this possible?
>
>
>>>>  parse('{"name": "outer", "type": "record", "fields": [{"name": "a", "type": "array", "items": "string"}]}')
>array a False None None<avro.schema.Names object at 0x7f7f8acbb490>
>Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 709, in parse
>     return make_avsc_object(json_data, names)
>   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 665, in make_avsc_object
>     return RecordSchema(name, namespace, fields, names, type)
>   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 601, in __init__
>     field_objects = RecordSchema.make_field_objects(fields, names)
>   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 569, in make_field_objects
>     new_field = Field(type, name, has_default, default, order, names)
>   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 311, in __init__
>     raise SchemaParseException(fail_msg)
>avro.schema.SchemaParseException: Type property "array" not a valid Avro schema: Could not make an Avro Schema object from array.
>
>
>
>
>-- 
>Christopher Armstrong, id Software
>


Re: record containing array of records in python-avro

Posted by Christopher Armstrong <ch...@idsoftware.com>.
On 02/10/2011 07:25 PM, Christopher Armstrong wrote:
> Hi guys. I'm having trouble declaring a record with an array of 
> records in Python.

Actually, this is a red herring. The problem is much simpler: I can't 
have a record that contains an array as a field. Is this possible?


>>>  parse('{"name": "outer", "type": "record", "fields": [{"name": "a", "type": "array", "items": "string"}]}')
array a False None None<avro.schema.Names object at 0x7f7f8acbb490>
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 709, in parse
     return make_avsc_object(json_data, names)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 665, in make_avsc_object
     return RecordSchema(name, namespace, fields, names, type)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 601, in __init__
     field_objects = RecordSchema.make_field_objects(fields, names)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 569, in make_field_objects
     new_field = Field(type, name, has_default, default, order, names)
   File "/home/radix/lib/python/avro-1.4.1-py2.6.egg/avro/schema.py", line 311, in __init__
     raise SchemaParseException(fail_msg)
avro.schema.SchemaParseException: Type property "array" not a valid Avro schema: Could not make an Avro Schema object from array.




-- 
Christopher Armstrong, id Software