Posted to user@avro.apache.org by Francis Galiegue <fg...@gmail.com> on 2013/02/27 22:47:31 UTC

Question about Avro "records"

This is not spelled out in black and white in the spec, so this is a
guess: in its JSON representation, the fields defined in a record are
the only permissible fields. And there can be zero fields, in which case
the JSON representation is just the empty object.

Is this correct?

-- 
Francis Galiegue, fgaliegue@gmail.com
JSON Schema in Java: http://json-schema-validator.herokuapp.com

Re: Question about Avro "records"

Posted by Doug Cutting <cu...@apache.org>.
I don't think that's valid.  In general a value should be written for
each field in the writer's schema.  It's not ideal for JSON, but those
are the rules for binary, and it's best if such read/write logic can be
unaware of whether JSON or binary is being produced.  I suspect such
JSON would break existing implementations, but I've not checked.

Doug

On Wed, Feb 27, 2013 at 3:55 PM, Francis Galiegue <fg...@gmail.com> wrote:
> On Thu, Feb 28, 2013 at 12:10 AM, Doug Cutting <cu...@apache.org> wrote:
>> On Wed, Feb 27, 2013 at 2:51 PM, Francis Galiegue <fg...@gmail.com> wrote:
>>> Hmm, OK, the reader/writer distinction is something I am not accustomed to.
>>
>> http://avro.apache.org/docs/current/spec.html#Schema+Resolution
>>
>> Doug
>
> Hmmm, that does not quite answer my question about production of default values.
>
> For instance, when emitting data from an avro schema which reads:
>
> {
>     "type": "record",
>     "name": "whatever",
>     "fields": [ { "name": "a", "type": { "type": "int", "default": 0 } } ]
> }
>
> is emitting {} legal?
>
>
> --
> Francis Galiegue, fgaliegue@gmail.com
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
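
To make the rule in this reply concrete, here is a minimal Python sketch
of a writer-side check: every field declared in the writer's schema must
be present in the datum, whether or not the field has a default. The
helper name check_record is hypothetical and is not part of any Avro
library. The schema is the one from the question, except that "default"
has been moved up to the field level, which is where the Avro
specification defines it.

```python
import json

# Writer's schema from the question, with "default" placed on the field
# (an assumption about what was intended; the original nests it in "type").
WRITER_SCHEMA = json.loads("""
{
    "type": "record",
    "name": "whatever",
    "fields": [ { "name": "a", "type": "int", "default": 0 } ]
}
""")


def check_record(schema, datum):
    """Return (missing, unknown) field names for a dict against a record schema.

    Hypothetical helper. A field default does not excuse a missing value
    here, because the writer is expected to write every field.
    """
    declared = [field["name"] for field in schema["fields"]]
    missing = [name for name in declared if name not in datum]
    unknown = sorted(name for name in datum if name not in declared)
    return missing, unknown


print(check_record(WRITER_SCHEMA, {}))        # reports "a" as missing
print(check_record(WRITER_SCHEMA, {"a": 0}))  # nothing missing, nothing unknown
```

Under this reading, {} would be rejected for the schema above, and
{ "a": 0 } would be accepted.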

Re: Question about Avro "records"

Posted by Doug Cutting <cu...@apache.org>.
On Wed, Feb 27, 2013 at 5:30 PM, Jeremy Kahn <tr...@trochee.net> wrote:
> I'd be happy to add methods to the avro.RecordSchema objects (in the Python
> libraries) that "fill defaults" on missing member fields of a record,
> recursively (which probably means method extension of other schema classes
> as well). Shall I open a JIRA ticket for this for 1.7.5?

That would be great.  Thanks!

> (Does providing this for Python put me on the hook for such a thing in other
> implementation languages? I hope not.)

Not at all.

Cheers,

Doug

Re: Question about Avro "records"

Posted by Francis Galiegue <fg...@gmail.com>.
On Thu, Feb 28, 2013 at 2:30 AM, Jeremy Kahn <tr...@trochee.net> wrote:
> There seems to be no way to easily use the avro libraries in Python (where I
> feel most qualified to comment) to encode generics with "missing default
> values" and have them transmitted in well-formed avro binary.
>
> If you fill in the "missing" default values, the Python libraries will
> transmit correctly.
>
> I'd be happy to add methods to the avro.RecordSchema objects (in the Python
> libraries) that "fill defaults" on missing member fields of a record,
> recursively (which probably means method extension of other schema classes
> as well). Shall I open a JIRA ticket for this for 1.7.5?
>
> (Does providing this for Python put me on the hook for such a thing in other
> implementation languages? I hope not.)
>
> For backwards compatibility (and probably to avoid unnecessary data
> traversal), you'll probably want to explicitly ask the schema to fill in
> defaults before transmission in the cases where you'd like to generate the
> impoverished JSON from your example.
>
> This seems related to earlier discussion today about designing constructors
> to generate defaults already filled in.
>

In fact, I was just asking how this should be handled because I have
just finished writing an Avro schema to JSON Schema conversion
processor (which I'll put online soon), so I wanted to be as accurate
as possible when generating schemas ;)

Right now the generated schemas require all properties, even ones
having default values. I was wondering if that was the right thing to
do...

(and next I'm attacking the reverse: JSON Schema to Avro schema...)

-- 
Francis Galiegue, fgaliegue@gmail.com
JSON Schema in Java: http://json-schema-validator.herokuapp.com
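
For illustration only, the "require all properties" behaviour described
in this message could look like the Python sketch below. It is not the
conversion processor mentioned above. The function name
record_to_json_schema and the PRIMITIVES table are assumptions made for
this example, it handles only records whose fields have primitive types,
and it emits the array form of "required" used by JSON Schema draft 4
and later.

```python
# Assumed mapping from Avro primitive type names to JSON Schema type names.
PRIMITIVES = {
    "null": "null",
    "boolean": "boolean",
    "int": "integer",
    "long": "integer",
    "float": "number",
    "double": "number",
    "string": "string",
}


def record_to_json_schema(avro_schema):
    """Convert a flat Avro record schema (a dict) into a JSON Schema dict.

    Every declared field is required, even one that has a default, and no
    undeclared property is allowed.
    """
    properties = {}
    for field in avro_schema["fields"]:
        properties[field["name"]] = {"type": PRIMITIVES[field["type"]]}
    return {
        "type": "object",
        "properties": properties,
        "required": [field["name"] for field in avro_schema["fields"]],
        "additionalProperties": False,
    }


AVRO_SCHEMA = {
    "type": "record",
    "name": "whatever",
    "fields": [{"name": "a", "type": "int", "default": 0}],
}

print(record_to_json_schema(AVRO_SCHEMA))
```

Unions, nested records, arrays and maps would need more work, since
Avro's JSON encoding wraps non-null union values in a single-key object.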

Re: Question about Avro "records"

Posted by Jeremy Kahn <tr...@trochee.net>.
There seems to be no way to easily use the avro libraries in Python (where
I feel most qualified to comment) to encode generics with "missing default
values" and have them transmitted in well-formed avro binary.

If you fill in the "missing" default values, the Python libraries will
transmit correctly.

I'd be happy to add methods to the avro.RecordSchema objects (in the Python
libraries) that "fill defaults" on missing member fields of a record,
recursively (which probably means method extension of other schema classes
as well). Shall I open a JIRA ticket for this for 1.7.5?

(Does providing this for Python put me on the hook for such a thing in
other implementation languages? I hope not.)

For backwards compatibility (and probably to avoid unnecessary data
traversal), you'll probably want to explicitly ask the schema to fill in
defaults before transmission in the cases where you'd like to generate the
impoverished JSON from your example.

This seems related to earlier discussion today about designing constructors
to generate defaults already filled in.

Jeremy
On Feb 27, 2013 3:55 PM, "Francis Galiegue" <fg...@gmail.com> wrote:

> On Thu, Feb 28, 2013 at 12:10 AM, Doug Cutting <cu...@apache.org> wrote:
> > On Wed, Feb 27, 2013 at 2:51 PM, Francis Galiegue <fg...@gmail.com> wrote:
> >> Hmm, OK, the reader/writer distinction is something I am not accustomed to.
> >
> > http://avro.apache.org/docs/current/spec.html#Schema+Resolution
> >
> > Doug
>
> Hmmm, that does not quite answer my question about production of default
> values.
>
> For instance, when emitting data from an avro schema which reads:
>
> {
>     "type": "record",
>     "name": "whatever",
>     "fields": [ { "name": "a", "type": { "type": "int", "default": 0 } } ]
> }
>
> is emitting {} legal?
>
>
> --
> Francis Galiegue, fgaliegue@gmail.com
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
>
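
The "fill defaults" idea proposed in this message can be sketched in
plain Python as below. This is only an illustration working on schemas
held as parsed JSON (dicts and lists); it is not the method that was
proposed for the avro Python library, and the name fill_defaults is
hypothetical. It assumes "default" sits on the field, as the Avro
specification defines it, and it recurses only into nested records.

```python
import copy


def fill_defaults(schema, datum):
    """Return a copy of datum with missing record fields filled from defaults.

    Non-record schemas (primitives, unions, arrays, maps) are passed through
    unchanged in this sketch.
    """
    if not (isinstance(schema, dict) and schema.get("type") == "record"):
        return datum
    filled = dict(datum)  # shallow copy; undeclared keys are left untouched
    for field in schema["fields"]:
        name = field["name"]
        if name in datum:
            filled[name] = fill_defaults(field["type"], datum[name])
        elif "default" in field:
            # Deep-copy so callers cannot mutate the schema's default value.
            default = copy.deepcopy(field["default"])
            filled[name] = fill_defaults(field["type"], default)
        else:
            raise ValueError("no value and no default for field %r" % name)
    return filled


SCHEMA = {
    "type": "record",
    "name": "Outer",
    "fields": [
        {"name": "a", "type": "int", "default": 0},
        {"name": "inner", "type": {
            "type": "record",
            "name": "Inner",
            "fields": [{"name": "b", "type": "long", "default": 5}],
        }},
    ],
}

print(fill_defaults(SCHEMA, {"inner": {}}))
```

Calling it explicitly before serialization, rather than on every write,
matches the backwards-compatibility concern raised above.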

Re: Question about Avro "records"

Posted by Francis Galiegue <fg...@gmail.com>.
On Thu, Feb 28, 2013 at 12:10 AM, Doug Cutting <cu...@apache.org> wrote:
> On Wed, Feb 27, 2013 at 2:51 PM, Francis Galiegue <fg...@gmail.com> wrote:
>> Hmm, OK, the reader/writer distinction is something I am not accustomed to.
>
> http://avro.apache.org/docs/current/spec.html#Schema+Resolution
>
> Doug

Hmmm, that does not quite answer my question about production of default values.

For instance, when emitting data from an avro schema which reads:

{
    "type": "record",
    "name": "whatever",
    "fields": [ { "name": "a", "type": { "type": "int", "default": 0 } } ]
}

is emitting {} legal?


-- 
Francis Galiegue, fgaliegue@gmail.com
JSON Schema in Java: http://json-schema-validator.herokuapp.com

Re: Question about Avro "records"

Posted by Doug Cutting <cu...@apache.org>.
On Wed, Feb 27, 2013 at 2:51 PM, Francis Galiegue <fg...@gmail.com> wrote:
> Hmm, OK, the reader/writer distinction is something I am not accustomed to.

http://avro.apache.org/docs/current/spec.html#Schema+Resolution

Doug

Re: Question about Avro "records"

Posted by Francis Galiegue <fg...@gmail.com>.
On Wed, Feb 27, 2013 at 11:20 PM, Doug Cutting <cu...@apache.org> wrote:
> That sounds right to me.  To be clear, the schema in question here is
> the writer's.  A reader schema which did not have "c" could read this,
> dropping the "c" values from the writer's schema.
>

Hmm, OK, the reader/writer distinction is something I am not accustomed to.

Who can "produce" default values? The reader, the writer or both?

-- 
Francis Galiegue, fgaliegue@gmail.com
JSON Schema in Java: http://json-schema-validator.herokuapp.com

Re: Question about Avro "records"

Posted by Doug Cutting <cu...@apache.org>.
That sounds right to me.  To be clear, the schema in question here is
the writer's.  A reader schema which did not have "c" could read this,
dropping the "c" values from the writer's schema.

Doug

On Wed, Feb 27, 2013 at 2:14 PM, Francis Galiegue <fg...@gmail.com> wrote:
> On Wed, Feb 27, 2013 at 10:54 PM, Pankaj Shroff <sh...@gmail.com> wrote:
>> That doesn't seem to be the case, especially because if you define a
>> record with a bunch of optional fields, then you would end up with an
>> empty object (or rather an object with null values for its fields). Am I
>> misunderstanding your question?
>>
>
> OK, I have probably misworded the question. Let's say I have a record
> defining fields "a" and "b". For simplicity, their permissible values
> are ints.
>
> As I understand it:
>
> { "a": 1 }
>
> is not legal since "b" is not provided.
>
> This:
>
> { "a": 1, "b": 2, "c": 3 }
>
> is not legal either since "c" is not defined.
>
> BUT: { "a": 1 } can be legal IF a default value is provided for "b".
>
> Am I getting this right, partially right, completely wrong?
>
> --
> Francis Galiegue, fgaliegue@gmail.com
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
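
The reader/writer point in this reply follows the record rules in the
Schema Resolution section of the spec: a writer's field that the reader
does not declare is ignored, and a reader's field that the writer does
not declare is taken from the reader's default, or is an error if there
is none. Below is a minimal Python sketch of just those rules. The
function name resolve_record and the extra reader field "d" are
assumptions for the example, and type promotion and aliases are left
out.

```python
def resolve_record(writer_schema, reader_schema, datum):
    """Project a datum written with writer_schema onto reader_schema."""
    writer_fields = {field["name"] for field in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # The writer always writes every field it declares.
            resolved[name] = datum[name]
        elif "default" in field:
            # Only the reader produces default values.
            resolved[name] = field["default"]
        else:
            raise ValueError("reader field %r has no default" % name)
    return resolved


WRITER = {
    "type": "record",
    "name": "r",
    "fields": [
        {"name": "a", "type": "int"},
        {"name": "b", "type": "int"},
        {"name": "c", "type": "int"},
    ],
}

READER = {
    "type": "record",
    "name": "r",
    "fields": [
        {"name": "a", "type": "int"},
        {"name": "b", "type": "int"},
        {"name": "d", "type": "int", "default": -1},
    ],
}

# "c" is dropped because the reader does not declare it;
# "d" is filled in from the reader's default.
print(resolve_record(WRITER, READER, {"a": 1, "b": 2, "c": 3}))
```

So { "a": 1, "b": 2, "c": 3 } is what the writer must produce, and the
reader sees only the fields it asked for.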

Re: Question about Avro "records"

Posted by Francis Galiegue <fg...@gmail.com>.
On Wed, Feb 27, 2013 at 11:21 PM, Pankaj Shroff <sh...@gmail.com> wrote:
> Yes, that's right, and the default value can be "null", which is what
> makes "b" an "optional" field.
>
> You can define an optional field by giving it the type "union" in an Avro
> schema, where the first type in the union is "null" and the second type is
> "long" (or "int" in your case).
>
> Something like this (a .avsc or .avpr file would contain the following JSON):
>
>
> {
>     "type": "record",
>     "name": "OptionalFieldsExample",
>     "fields": [
>         {"name": "a", "type": "long"},
>         {"name": "b", "type": ["null", "long"]},
>         {"name": "c", "type": ["null", "long"]}
>     ]
> }
>

Is that a reader's or a writer's schema? Sorry for the newbie questions...

-- 
Francis Galiegue, fgaliegue@gmail.com
JSON Schema in Java: http://json-schema-validator.herokuapp.com

Re: Question about Avro "records"

Posted by Pankaj Shroff <sh...@gmail.com>.
Yes, that's right, and the default value can be "null", which is what
makes "b" an "optional" field.

You can define an optional field by giving it the type "union" in an Avro
schema, where the first type in the union is "null" and the second type is
"long" (or "int" in your case).

Something like this (a .avsc or .avpr file would contain the following JSON):

{
    "type": "record",
    "name": "OptionalFieldsExample",
    "fields": [
        {"name": "a", "type": "long"},
        {"name": "b", "type": ["null", "long"]},
        {"name": "c", "type": ["null", "long"]}
    ]
}
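
As a rough illustration of how such a record looks in Avro's JSON
encoding, here is a small Python sketch. Two things in it are
assumptions rather than part of the schema as posted: each optional
field is given an explicit "default": null (None in Python), which is
what a reader would need in order to fill the field in, and the helper
encode_json is hypothetical and handles only "long" and
["null", "long"] fields. In the JSON encoding, a null union value is
written as null and a non-null one as a single-key object naming its
type.

```python
import json

SCHEMA = {
    "type": "record",
    "name": "OptionalFieldsExample",
    "fields": [
        {"name": "a", "type": "long"},
        {"name": "b", "type": ["null", "long"], "default": None},
        {"name": "c", "type": ["null", "long"], "default": None},
    ],
}


def encode_json(schema, datum):
    """Encode a dict as an Avro-style JSON string (long and null/long only)."""
    encoded = {}
    for field in schema["fields"]:
        name = field["name"]
        value = datum[name]  # KeyError if the writer omits a declared field
        if isinstance(field["type"], list):
            # Union: null stays null, a long is wrapped as {"long": value}.
            encoded[name] = None if value is None else {"long": value}
        else:
            encoded[name] = value
    return json.dumps(encoded)


print(encode_json(SCHEMA, {"a": 1, "b": None, "c": 3}))
# {"a": 1, "b": null, "c": {"long": 3}}
```

Note that the writer still supplies a value for "b"; the value just
happens to be null.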



On Wed, Feb 27, 2013 at 5:14 PM, Francis Galiegue <fg...@gmail.com> wrote:

> On Wed, Feb 27, 2013 at 10:54 PM, Pankaj Shroff <sh...@gmail.com> wrote:
> > That doesn't seem to be the case, especially because if you define a
> > record with a bunch of optional fields, then you would end up with an
> > empty object (or rather an object with null values for its fields). Am I
> > misunderstanding your question?
> >
>
> OK, I have probably misworded the question. Let's say I have a record
> defining fields "a" and "b". For simplicity, their permissible values
> are ints.
>
> As I understand it:
>
> { "a": 1 }
>
> is not legal since "b" is not provided.
>
> This:
>
> { "a": 1, "b": 2, "c": 3 }
>
> is not legal either since "c" is not defined.
>
> BUT: { "a": 1 } can be legal IF a default value is provided for "b".
>
> Am I getting this right, partially right, completely wrong?
>
> --
> Francis Galiegue, fgaliegue@gmail.com
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
>



-- 
Pankaj Shroff
shroffG@Gmail.com

Re: Question about Avro "records"

Posted by Francis Galiegue <fg...@gmail.com>.
On Wed, Feb 27, 2013 at 10:54 PM, Pankaj Shroff <sh...@gmail.com> wrote:
> That doesn't seem to be the case, especially because if you define a
> record with a bunch of optional fields, then you would end up with an
> empty object (or rather an object with null values for its fields). Am I
> misunderstanding your question?
>

OK, I have probably misworded the question. Let's say I have a record
defining fields "a" and "b". For simplicity, their permissible values
are ints.

As I understand it:

{ "a": 1 }

is not legal since "b" is not provided.

This:

{ "a": 1, "b": 2, "c": 3 }

is not legal either since "c" is not defined.

BUT: { "a": 1 } can be legal IF a default value is provided for "b".

Am I getting this right, partially right, completely wrong?

-- 
Francis Galiegue, fgaliegue@gmail.com
JSON Schema in Java: http://json-schema-validator.herokuapp.com

Re: Question about Avro "records"

Posted by Pankaj Shroff <sh...@gmail.com>.
That doesn't seem to be the case, especially because if you define a
record with a bunch of optional fields, then you would end up with an
empty object (or rather an object with null values for its fields). Am I
misunderstanding your question?



On Wed, Feb 27, 2013 at 4:47 PM, Francis Galiegue <fg...@gmail.com> wrote:

> This is not spelled out in black and white in the spec, so this is a
> guess: in its JSON representation, the fields defined in a record are
> the only permissible fields. And there can be zero fields, in which case
> the JSON representation is just the empty object.
>
> Is this correct?
>
> --
> Francis Galiegue, fgaliegue@gmail.com
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
>



-- 
Pankaj Shroff
shroffG@Gmail.com