Posted to dev@avro.apache.org by "Marcio Silva (JIRA)" <ji...@apache.org> on 2012/11/14 05:02:11 UTC

[jira] [Created] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types

Marcio Silva created AVRO-1204:
----------------------------------

             Summary: Schema.createUnion can produce schemas that aren't parseable due to redefinition of types
                 Key: AVRO-1204
                 URL: https://issues.apache.org/jira/browse/AVRO-1204
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.7.3
            Reporter: Marcio Silva


Schemas returned from {{Schema.createUnion}} can't always be re-parsed from the output of {{Schema.toString()}}.

If you create a union of types that contain multiple references to the same type, the resulting {{Schema}} can't be written to JSON and then re-parsed: parsing fails with a {{SchemaParseException}} on type redefinition.

The fix probably involves changes to {{UnionSchema.toJson}} to ensure that any repeated type definitions are replaced by a named reference.

For a concrete example, the union of the following two schemas is problematic:
{code:javascript|title=child-schema.avsc}
{
   "type":"record",
   "name":"UnionTestChild",
   "namespace":"org.apache.avro",
   "fields":[
      { "name":"intField","type":"int"},
      { "name":"stringField", "type":"string"}
   ]
}
{code}

{code:javascript|title=parent-schema.avsc}
{
   "type":"record",
   "name":"UnionTestParent",
   "namespace":"org.apache.avro",
   "fields":[
      {
         "name":"child",
         "type":{
            "type":"record",
            "name":"UnionTestChild",
            "namespace":"org.apache.avro",
            "fields":[
               { "name":"intField","type":"int"},
               { "name":"stringField", "type":"string"}
            ]
         }
      }
   ]
}
{code}
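
The idea of replacing repeated type definitions with a named reference can be sketched without Avro. Everything below ({{Rec}}, {{toJson}}, {{unionToJson}}) is a hypothetical model written for illustration, not Avro's {{UnionSchema}} or its actual serialization code. It assumes that a record is identified by its full name and that one set of already-written names is shared across all branches of the union.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Avro's classes: just enough to show name tracking.
final class NamedRefSketch {

    /** A record with a full name; each field type is a primitive name (String) or a Rec. */
    static final class Rec {
        final String fullName;
        final List<String> fieldNames = new ArrayList<>();
        final List<Object> fieldTypes = new ArrayList<>();

        Rec(String fullName) { this.fullName = fullName; }

        Rec field(String name, Object type) {
            fieldNames.add(name);
            fieldTypes.add(type);
            return this;
        }
    }

    /** Writes a type; a record whose name is already in 'seen' is written as a quoted name only. */
    static String toJson(Object type, Set<String> seen) {
        if (type instanceof String) {
            return "\"" + type + "\"";
        }
        Rec rec = (Rec) type;
        if (!seen.add(rec.fullName)) {
            return "\"" + rec.fullName + "\""; // repeated definition becomes a named reference
        }
        StringBuilder sb = new StringBuilder();
        sb.append("{\"type\":\"record\",\"name\":\"").append(rec.fullName).append("\",\"fields\":[");
        for (int i = 0; i < rec.fieldNames.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"name\":\"").append(rec.fieldNames.get(i)).append("\",\"type\":")
              .append(toJson(rec.fieldTypes.get(i), seen)).append('}');
        }
        return sb.append("]}").toString();
    }

    /** Writes a union as a JSON array, sharing one 'seen' set across all branches. */
    static String unionToJson(List<Object> branches) {
        Set<String> seen = new HashSet<>();
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < branches.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(toJson(branches.get(i), seen));
        }
        return sb.append(']').toString();
    }

    /** Builds the child/parent pair from this report and writes their union. */
    static String example() {
        Rec child = new Rec("org.apache.avro.UnionTestChild")
                .field("intField", "int")
                .field("stringField", "string");
        Rec parent = new Rec("org.apache.avro.UnionTestParent").field("child", child);
        List<Object> branches = new ArrayList<>();
        branches.add(child);
        branches.add(parent);
        return unionToJson(branches);
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

In this model the child record is written in full once, as the first branch, and the parent's {{child}} field then refers to it by name, so a parser reading the output would see only one definition of {{UnionTestChild}}.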

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types

Posted by "Marcio Silva (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcio Silva updated AVRO-1204:
-------------------------------

    Attachment: 0001-Creating-test-case-to-illustrate-Union-schema-proble.patch

A patch that contains a junit test that illustrates the problem.
                

[jira] [Commented] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types

Posted by "Marcio Silva (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499294#comment-13499294 ] 

Marcio Silva commented on AVRO-1204:
------------------------------------

Sorry, I should've tested the unit test more thoroughly. (I stopped when I saw the same error I had been getting; I should've looked at the stack trace.) We were seeing this bug with some of our internal schemas. I'll try to create a better test that reproduces the error we were seeing.
                

[jira] [Updated] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-1204:
-------------------------------

    Attachment: AVRO-1204.patch

I was able to make this test pass.  I have attached my modified version.

At first it failed in setUp() while parsing the parent schema using a parser that already had the child schema defined.  So I changed it to create a new parser for each, since the parent file is a standalone schema and does not refer to the child by name.
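
The setUp() failure described above can be modeled in a few lines. This is a sketch, not Avro code: {{ParserNamesSketch}} is a hypothetical stand-in for a parser's table of known named types, and its error message is made up for the illustration. It assumes only that a parser rejects a second full definition of a name it has already seen.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a schema parser's table of known named types.
final class ParserNamesSketch {

    private final Map<String, String> known = new HashMap<>();

    /** Registers a full definition; a second definition of the same name is an error. */
    void define(String fullName, String definition) {
        if (known.containsKey(fullName)) {
            throw new IllegalStateException("Can't redefine: " + fullName);
        }
        known.put(fullName, definition);
    }

    /** Simulates parsing child-schema.avsc: one definition of UnionTestChild. */
    static void parseChild(ParserNamesSketch parser) {
        parser.define("org.apache.avro.UnionTestChild", "child record");
    }

    /** Simulates parsing parent-schema.avsc, which inlines UnionTestChild in full. */
    static void parseParent(ParserNamesSketch parser) {
        parser.define("org.apache.avro.UnionTestParent", "parent record");
        parser.define("org.apache.avro.UnionTestChild", "child record (inlined)");
    }

    /** Returns true if both files parse with the given parsers, false on a redefinition. */
    static boolean parsesBoth(ParserNamesSketch forChild, ParserNamesSketch forParent) {
        try {
            parseChild(forChild);
            parseParent(forParent);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ParserNamesSketch shared = new ParserNamesSketch();
        System.out.println("shared parser: " + parsesBoth(shared, shared));
        System.out.println("fresh parsers: "
                + parsesBoth(new ParserNamesSketch(), new ParserNamesSketch()));
    }
}
```

With one shared instance the inlined child definition in the parent file collides with the one already registered; with a fresh instance per file each definition is seen only once.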

Second, I changed the assertEquals() in assertSchemaParseable() to compare the schema against its printed and parsed equivalent.

I think these were both bugs in the test, not bugs in Avro.  So, perhaps there is a bug in Avro you're encountering, but I don't see how this test demonstrates it.
                

[jira] [Updated] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types

Posted by "Marcio Silva (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcio Silva updated AVRO-1204:
-------------------------------

    Affects Version/s: 1.7.2
    