Posted to dev@avro.apache.org by "Jeff Hodges (JIRA)" <ji...@apache.org> on 2010/04/28 09:25:32 UTC

[jira] Issue Comment Edited: (AVRO-529) Cannot use array type in avro request

    [ https://issues.apache.org/jira/browse/AVRO-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861698#action_12861698 ] 

Jeff Hodges edited comment on AVRO-529 at 4/28/10 3:24 AM:
-----------------------------------------------------------

Okay, so a more complete example I got from Rael is:

{code}
 {
    "namespace": "somenamespace",
    "protocol": "Composition",
    "types": [],
    "messages": {
      "get_things": {
        "request": [
          { "name": "id", "type": "array", "items": "long" }
        ],
        "response": { "type": "array", "items": "long" }
      }
    }
  }
{code}

The problem line for Wanli and Rael is the field in the request:

{quote}
"request": [
          { "name": "id", "type": "array", "items": "long" }
        ],
{quote}

This, unfortunately, is not allowed in Avro! You see, this type has been "flattened". It should look like:

{quote}
"request": [
          { "name": "id", "type": {"type": "array", "items": "long" }}
        ],
{quote}

This is unintuitive, but allowing it to parse like above leads to other problems. Say we mean to define a user-defined type for the first time in the request field like so:

{quote}
{"name": "mode", "type": {"name": "FileMode", "type": "enum", "symbols": ["w", "r"]}}
{quote}

but "flatten" the definition like the "bad" protocol given:

{quote}
{"name": "mode",  "type": "enum", "symbols": ["w", "r"]}
{quote}

This second field is weird! We have the term "name" being overloaded! In the first one, we have a field named "mode" that takes a type named "FileMode" (remember that enums require names!). In the second one we have a field named "mode" and then some stuff that could be creating an enum named "mode" but really is just incomplete!

This behavior is slightly confusing when dealing with array and map types, but is consistent with the necessity to "nest" the named types enum, fixed, record, error and, of course, user-defined types.

NOW! We can think about two different things. One is to improve this error message. Basically, if there are keys other than "type", "name", "doc" and "default" in a record's field, we should throw a very specific error saying "nest this type". This is relatively straightforward.
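
To make the first idea concrete, here is a minimal sketch of such a check in plain Ruby. It is not the gem's actual code: the names ALLOWED_FIELD_KEYS, FlattenedTypeError and check_field! are made up for illustration, and the list of allowed keys is simply the one given above.

```ruby
require 'json'

# Keys a field may carry directly. This list comes from the comment
# above, not from the Avro source (assumption).
ALLOWED_FIELD_KEYS = %w[name type doc default].freeze

# Hypothetical error class for this sketch only.
class FlattenedTypeError < StandardError; end

# Returns the field unchanged when it only has allowed keys; otherwise
# raises an error that tells the user to nest the type definition.
def check_field!(field)
  extra = field.keys - ALLOWED_FIELD_KEYS
  return field if extra.empty?

  raise FlattenedTypeError,
        "field #{field['name'].inspect} has unexpected keys #{extra.inspect}; " \
        "nest this type, e.g. \"type\": {\"type\": #{field['type'].inspect}, ...}"
end

flattened = JSON.parse('{ "name": "id", "type": "array", "items": "long" }')
nested    = JSON.parse('{ "name": "id", "type": {"type": "array", "items": "long"} }')

check_field!(nested) # the nested form passes untouched

begin
  check_field!(flattened)
rescue FlattenedTypeError => e
  puts e.message # names the stray "items" key and suggests nesting
end
```

A real patch would presumably raise the existing Avro::SchemaParseError from wherever fields are built rather than introduce a new class.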

The second thing is to think about changing the spec to allow array and map types to be "flattened" in this manner as they have no "name" conflicts. It does, however, cause weird problems where people will expect primitive behavior when they get complex behavior.
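
For the sake of discussion, a rough sketch of what accepting "flattened" array and map types could mean: rewrite them into the nested form before parsing, and leave everything else alone. This is only an illustration of the idea, not a proposal for the spec; UNNAMED_TYPE_KEYS and nest_flattened are hypothetical names.

```ruby
# The unnamed complex types and the keys that describe them. Only these
# are candidates for flattening, since they carry no "name" of their own.
UNNAMED_TYPE_KEYS = { 'array' => %w[items], 'map' => %w[values] }.freeze

# Returns a copy of the field with a flattened array or map definition
# moved into a nested "type" object. Any other field is returned as-is.
def nest_flattened(field)
  type = field['type']
  keys = UNNAMED_TYPE_KEYS[type] # nil for nested hashes and other types
  return field unless keys && keys.all? { |k| field.key?(k) }

  nested_type = { 'type' => type }
  keys.each { |k| nested_type[k] = field[k] }

  # Drop the type-describing keys from the field and swap in the nested type.
  field.reject { |k, _| keys.include?(k) }.merge('type' => nested_type)
end

field = { 'name' => 'id', 'type' => 'array', 'items' => 'long' }
p nest_flattened(field)

# A flattened enum is deliberately left alone: it needs its own "name".
enum_field = { 'name' => 'mode', 'type' => 'enum', 'symbols' => %w[w r] }
p nest_flattened(enum_field)
```

Note that the enum case still falls through to the parser, so the "name" overloading problem described above stays an error.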

Really, the root of the confusion is that the "simple" versions of "primitive" types (e.g. "string" instead of {"type":"string"}) mislead newbies! If we disallowed them, people would immediately glom on to this nesting problem!

So, the first thing is probably doable. The second one requires more thought.

  
> Cannot use array type in avro request
> -------------------------------------
>
>                 Key: AVRO-529
>                 URL: https://issues.apache.org/jira/browse/AVRO-529
>             Project: Avro
>          Issue Type: Bug
>          Components: ruby
>    Affects Versions: 1.4.0
>         Environment: Mac OS X
>            Reporter: Wanli Yang
>
> While it is OK to use array in avro response, we get the following error when using it in the type definition. e.g.:
>   {
>     "namespace": "my_namespace",
>     "protocol": "MyProtocol",
>     "types": [
>       {
>         "name": "Thing", "type": "record", "fields": [
>           { "name": "id", "type": "long" },
>           { "name": "other_ids", "type": "array", "items": "long" }
> ...
> /script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:78:in `real_parse': "array" is not  a schema we know about. (Avro::SchemaParseError)
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:135:in `subparse'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:369:in `initialize'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:180:in `new'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:180:in `make_field_objects'
> 	from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `each_with_index'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:174:in `each'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:174:in `each_with_index'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:174:in `make_field_objects'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:200:in `initialize'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:53:in `new'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/schema.rb:53:in `real_parse'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/protocol.rb:73:in `parse_types'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/protocol.rb:70:in `collect'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/protocol.rb:70:in `parse_types'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/protocol.rb:54:in `initialize'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/protocol.rb:31:in `new'
> 	from ./script/../vendor/gems/avro-1.4.0.pre1/lib/avro/protocol.rb:31:in `parse'
