Posted to dev@avro.apache.org by "Ryan Skraba (Jira)" <ji...@apache.org> on 2022/02/04 10:54:00 UTC
[jira] [Updated] (AVRO-3370) [Spec] Inconsistent behaviour on types as invalid names.

     [ https://issues.apache.org/jira/browse/AVRO-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Skraba updated AVRO-3370:
------------------------------
    Description: 
We've run across this in some code that interoperates between Java and Python.

The spec [currently forbids|https://avro.apache.org/docs/current/spec.html#names] using a primitive type name as the name of a named schema: _*Primitive type names have no namespace and their names may not be defined in any namespace.*_
{code:java}
{"type":"record","name":"long","fields":[{"name":"a1","type":"long"}]} {code}
That fails in Java with {{"org.apache.avro.AvroTypeException: Schemas may not be named after primitives: long"}}

What do we expect to happen when a named schema uses a complex type name (such as {{record}}) as its name?
{code:java}
{"type":"record","name":"record","fields":[{"name":"a1","type":"long"}]} {code}
This currently *succeeds* in Java and the schema can be used to serialize and deserialize data.

This currently *fails* in Python with: {{avro.schema.SchemaParseException: record is a reserved type name}}

Which one is the correct behaviour?
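
The difference can be pictured as two name-validation policies. This is a minimal sketch in plain Python, not code taken from either Avro implementation: the function names and the {{COMPLEX_TYPES}} set are assumptions made for illustration, and only the outcomes for {{long}} and {{record}} come from the behaviour reported above.

```python
# Illustrative only: neither function is copied from an Avro library.
PRIMITIVE_TYPES = {"null", "boolean", "int", "long",
                   "float", "double", "bytes", "string"}
# Assumed set of complex type names that could be treated as reserved.
COMPLEX_TYPES = {"record", "enum", "array", "map", "fixed"}


def allowed_java_like(name):
    """Reject only primitive type names (what Java is reported to do)."""
    return name not in PRIMITIVE_TYPES


def allowed_python_like(name):
    """Reject primitive and complex type names (what Python is reported to do)."""
    return name not in (PRIMITIVE_TYPES | COMPLEX_TYPES)


for name in ("long", "record", "LinkedList"):
    print(name, allowed_java_like(name), allowed_python_like(name))
```

Under these assumptions the two policies only disagree on names such as {{record}}, which is exactly the case the spec wording leaves open.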

This gets a bit more complicated when we consider using the name as a reference.

The following two schemas both work in Java:
{code:java}
{"type":"record","name":"LinkedList",
"fields":[
  {"name":"value","type":"int"},
  {"name":"next","type":["null","LinkedList"]}]}
{code}
{code:java}
{"type":"record","name":"LinkedList",
"fields":[
  {"name":"value","type":"int"},
  {"name":"next","type":["null",{"type":"LinkedList"}]}]}
{code}
If we rename {{LinkedList}} to {{record}}, the former succeeds in Java and the latter fails with {{org.apache.avro.SchemaParseException: No name in schema: {"type":"record"}}}
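
One way to read that failure: when {{type}} appears inside a JSON object, a parser has to decide whether the string names a built-in type or refers to a previously defined schema. The sketch below is plain Python and only an assumption about the resolution order, not the actual Java parser; {{resolve_type}} and {{known_names}} are hypothetical names.

```python
PRIMITIVES = {"null", "boolean", "int", "long",
              "float", "double", "bytes", "string"}
NAMED_COMPLEX = {"record", "enum", "fixed"}   # definitions that need a "name"
UNNAMED_COMPLEX = {"array", "map"}            # definitions without a name


def resolve_type(node, known_names):
    """Classify a schema node as 'primitive', 'reference' or 'definition'."""
    if isinstance(node, str):
        # A bare string is a primitive or a reference to a known name.
        if node in PRIMITIVES:
            return "primitive"
        if node in known_names:
            return "reference"
        raise ValueError(f"Undefined name: {node}")

    type_name = node["type"]
    if type_name in PRIMITIVES:
        return "primitive"
    if type_name in NAMED_COMPLEX:
        # Assumed order: the built-in meaning wins over any user-defined
        # schema with the same name, so the object must be a definition.
        if "name" not in node:
            raise ValueError(f"No name in schema: {node}")
        return "definition"
    if type_name in UNNAMED_COMPLEX:
        return "definition"
    if type_name in known_names:
        return "reference"
    raise ValueError(f"Undefined name: {type_name}")


print(resolve_type("record", {"record"}))                    # bare string
print(resolve_type({"type": "LinkedList"}, {"LinkedList"}))  # object form
```

With this ordering, the bare string {{"record"}} is looked up as a reference, while the object form with {{"type":"record"}} is treated as an incomplete record definition, which would match the two outcomes described above.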

> [Spec] Inconsistent behaviour on types as invalid names.
> --------------------------------------------------------
>
>                 Key: AVRO-3370
>                 URL: https://issues.apache.org/jira/browse/AVRO-3370
>             Project: Apache Avro
>          Issue Type: Bug
>            Reporter: Ryan Skraba
>            Priority: Major
>
> We've run across this in some code that interoperates between Java and Python.
> The spec [currently forbids|https://avro.apache.org/docs/current/spec.html#names] using a primitive type name as the name of a named schema: _*Primitive type names have no namespace and their names may not be defined in any namespace.*_
> {code:java}
> {"type":"record","name":"long","fields":[{"name":"a1","type":"long"}]} {code}
> That fails in Java with {{"org.apache.avro.AvroTypeException: Schemas may not be named after primitives: long"}}
> What do we expect to happen when a named schema uses a complex type name (such as {{record}}) as its name?
> {code:java}
> {"type":"record","name":"record","fields":[{"name":"a1","type":"long"}]} {code}
> This currently *succeeds* in Java and the schema can be used to serialize and deserialize data.
> This currently *fails* in Python with: {{avro.schema.SchemaParseException: record is a reserved type name}}
> Which one is the correct behaviour?
> This gets a bit more complicated when we consider using the name as a reference.
> The following two schemas both work in Java:
> {code:java}
> {"type":"record","name":"LinkedList",
> "fields":[
>   {"name":"value","type":"int"},
>   {"name":"next","type":["null","LinkedList"]}]}
> {code}
> {code:java}
> {"type":"record","name":"LinkedList",
> "fields":[
>   {"name":"value","type":"int"},
>   {"name":"next","type":["null",{"type":"LinkedList"}]}]}
> {code}
> If we rename {{LinkedList}} to {{record}}, the former succeeds in Java and the latter fails with {{org.apache.avro.SchemaParseException: No name in schema: {"type":"record"}}}


