Posted to user@avro.apache.org by Tayssir John Gabbour <tj...@pentaside.org> on 2012/09/12 13:52:13 UTC

Defining optional records? (The usual way with unions doesn't work)

Hi!

I'm trying to define optional records via ["null",
<record_definition>]. However, my attempt below fails with:
  "org.apache.avro.AvroTypeException: Unknown union branch a"

(BTW, unions like ["null", "string"] work, but I don't know how to
define non-recursive complex types so I can refer to them briefly by
name like that.)



SCHEMA:
{ "type": "record",
  "name": "User",
  "fields" : [
    {"name": "username", "type": "string"},
    {"name": "x", "type": ["null",
                           {"type": "record",
                            "name": "Test",
                            "fields" : [
                              {"name": "a", "type": "int"}
                            ]}
                          ]}
  ]}

DATA:
{"username": "john", "x": {"a": 1}}
{"username": "ryan", "x": {"a": 1}}

COMMANDLINE:
java -jar avro-tools-1.5.4.jar fromjson --schema-file my-schema.schema
my-data.json -codec snappy > my-output.avro

JAVA VERSION:
"1.6.0_26"


Thanks for any help,
  Tj

Re: Defining optional records? (The usual way with unions doesn't work)

Posted by François Kawala <fk...@bestofmedia.com>.
Hello,

I've encountered the same issue, and finally gave up trying to solve it.

My workaround was to declare a union type with the "potentially null
record schema" and a string, the string being filled with a special
value that signals the record's absence when needed.

You can check the following thread if you want some additional
information:

http://mail-archives.apache.org/mod_mbox/avro-user/201206.mbox/%3CCALEq1Z84=20PfJahT7DKtXEAGPFfOoHtVYXiyi+O+cZYGdXBNQ@mail.gmail.com%3E

If you are smarter/braver than me, I'd be pleased to learn of a neater
solution.

All the best,
François.

On 12/09/2012 13:52, Tayssir John Gabbour wrote:
> Hi!
>
> I'm trying to define optional records via ["null",
> <record_definition>]. However, my attempt below fails with:
>   "org.apache.avro.AvroTypeException: Unknown union branch a"
>
> (BTW, unions like ["null", "string"] work, but I don't know how to
> define non-recursive complex types so I can refer to them briefly by
> name like that.)
>
>
>
> SCHEMA:
> { "type": "record",
>   "name": "User",
>   "fields" : [
>     {"name": "username", "type": "string"},
>     {"name": "x", "type": ["null",
>                            {"type": "record",
>                             "name": "Test",
>                             "fields" : [
>                               {"name": "a", "type": "int"}
>                             ]}
>                           ]}
>   ]}
>
> DATA:
> {"username": "john", "x": {"a": 1}}
> {"username": "ryan", "x": {"a": 1}}
>
> COMMANDLINE:
> java -jar avro-tools-1.5.4.jar fromjson --schema-file my-schema.schema
> my-data.json -codec snappy > my-output.avro
>
> JAVA VERSION:
> "1.6.0_26"
>
>
> Thanks for any help,
>   Tj


Re: Defining optional records? (The usual way with unions doesn't work)

Posted by Doug Cutting <cu...@apache.org>.
On Wed, Sep 12, 2012 at 4:52 AM, Tayssir John Gabbour <tj...@pentaside.org> wrote:
> SCHEMA:
> { "type": "record",
>   "name": "User",
>   "fields" : [
>     {"name": "username", "type": "string"},
>     {"name": "x", "type": ["null",
>                            {"type": "record",
>                             "name": "Test",
>                             "fields" : [
>                               {"name": "a", "type": "int"}
>                             ]}
>                           ]}
>   ]}
>
> DATA:
> {"username": "john", "x": {"a": 1}}
> {"username": "ryan", "x": {"a": 1}}

Non-null union values in JSON need to be tagged with the intended branch.

http://avro.apache.org/docs/current/spec.html#json_encoding

So I think this needs to instead be something like:

{"username": "john", "x": {"Test": {"a": 1}}}
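If the data file already exists in the untagged form, one way to
repair it is a small script that wraps every non-null "x" value in a
single-key object named after the record branch. The following is a
minimal Python sketch, assuming the schema from this thread; the
helper names tag_optional_record and tag_user are made up for
illustration, and the second input line (with a null "x") is a
hypothetical example that is not part of the original data.

```python
import json


def tag_optional_record(value, record_name):
    """Wrap a non-null union value as {record_name: value}; leave null as-is."""
    if value is None:
        return None
    return {record_name: value}


def tag_user(user):
    """Return a copy of a User datum whose "x" field is tagged for the union."""
    tagged = dict(user)
    # "Test" is the name of the record branch in the ["null", Test] union.
    tagged["x"] = tag_optional_record(user.get("x"), "Test")
    return tagged


# First line is from the thread; the second (null "x") is hypothetical.
untagged_lines = [
    '{"username": "john", "x": {"a": 1}}',
    '{"username": "ryan", "x": null}',
]

for line in untagged_lines:
    print(json.dumps(tag_user(json.loads(line))))
```

The first line comes out in the tagged form shown above, and the null
value stays a bare null, which is how the JSON encoding represents the
"null" branch.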

If you're generating this JSON from Avro then you should use
JsonEncoder or the "tojson" command to do so, not the record's
.toString() method, which doesn't tag unions correctly. (Perhaps we
should fix the .toString() method to also tag unions, but that would
be an incompatible change.)
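To make the tagging rule concrete, here is a rough schema-driven
sketch in Python. It is not Avro's JsonEncoder and does not use the
Avro library; it is a simplified illustration that handles only
records, unions, and a few primitive types, and all function names in
it are invented for this example. It walks the schema alongside the
value and wraps each non-null union value in an object keyed by the
branch name.

```python
# Shallow type checks for the few primitives this sketch supports.
PRIMITIVES = {
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def matches(schema, value):
    """Return True if value plausibly belongs to schema (shallow check)."""
    if isinstance(schema, list):  # union: any branch may match
        return any(matches(branch, value) for branch in schema)
    if isinstance(schema, dict):
        if schema["type"] == "record":
            names = {field["name"] for field in schema["fields"]}
            return isinstance(value, dict) and names == set(value)
        return matches(schema["type"], value)
    return PRIMITIVES[schema](value)


def branch_name(schema):
    """Name used as the tag: the record's name, else the type name."""
    if isinstance(schema, dict):
        if schema["type"] == "record":
            return schema["name"]
        return branch_name(schema["type"])
    return schema


def encode(schema, value):
    """Return value with every non-null union value tagged by its branch."""
    if isinstance(schema, list):
        branch = next((b for b in schema if matches(b, value)), None)
        if branch is None:
            raise ValueError("value matches no union branch")
        if value is None:
            return None  # the null branch is written as a bare null
        return {branch_name(branch): encode(branch, value)}
    if isinstance(schema, dict) and schema["type"] == "record":
        return {f["name"]: encode(f["type"], value[f["name"]])
                for f in schema["fields"]}
    return value


# The schema from this thread, as a Python literal.
USER_SCHEMA = {
    "type": "record", "name": "User", "fields": [
        {"name": "username", "type": "string"},
        {"name": "x", "type": ["null", {
            "type": "record", "name": "Test",
            "fields": [{"name": "a", "type": "int"}]}]},
    ]}

print(encode(USER_SCHEMA, {"username": "john", "x": {"a": 1}}))
```

For real data, the JsonEncoder or "tojson" route mentioned above
remains the safer choice, since it covers the full set of Avro types.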

This has been asked before.  Perhaps we should add it to the FAQ?

Doug