Posted to dev@avro.apache.org by "Marcio Silva (JIRA)" <ji...@apache.org> on 2012/11/14 05:02:11 UTC
[jira] [Created] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types
Marcio Silva created AVRO-1204:
----------------------------------
Summary: Schema.createUnion can produce schemas that aren't parseable due to redefinition of types
Key: AVRO-1204
URL: https://issues.apache.org/jira/browse/AVRO-1204
Project: Avro
Issue Type: Bug
Components: java
Affects Versions: 1.7.3
Reporter: Marcio Silva
Schemas returned from {{Schema.createUnion}} can't always be re-parsed from the JSON produced by {{Schema.toString()}}.
If you create a union of types that contain multiple references to the same type, the resulting {{Schema}} can be written to JSON, but re-parsing that JSON fails with a {{SchemaParseException}} because the type is defined more than once.
The fix probably involves changing {{UnionSchema.toJson}} so that any repeated type definition is replaced by a reference to its name.
For a concrete example, the union of the following two schemas is problematic:
{code:javascript|title=child-schema.avsc}
{
  "type": "record",
  "name": "UnionTestChild",
  "namespace": "org.apache.avro",
  "fields": [
    { "name": "intField", "type": "int" },
    { "name": "stringField", "type": "string" }
  ]
}
{code}
{code:javascript|title=parent-schema.avsc}
{
  "type": "record",
  "name": "UnionTestParent",
  "namespace": "org.apache.avro",
  "fields": [
    {
      "name": "child",
      "type": {
        "type": "record",
        "name": "UnionTestChild",
        "namespace": "org.apache.avro",
        "fields": [
          { "name": "intField", "type": "int" },
          { "name": "stringField", "type": "string" }
        ]
      }
    }
  ]
}
{code}
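To make the suggested fix concrete, here is a minimal sketch of the idea of writing a repeated type definition as a reference to its name. It is a toy model and not Avro's code: ToyRecord, NamedRefWriter, writeUnion and write are hypothetical names invented for this illustration, and the only assumption carried over from the report is that the writer should share one set of already-defined names across all branches of the union.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical writer (not Avro's UnionSchema.toJson): emits each record definition once. */
final class NamedRefWriter {
    /** Writes a union as a JSON array, sharing one set of seen names across all branches. */
    static String writeUnion(List<ToyRecord> branches) {
        Set<String> seen = new HashSet<>();
        List<String> parts = new ArrayList<>();
        for (ToyRecord branch : branches) {
            parts.add(write(branch, seen));
        }
        return "[" + String.join(",", parts) + "]";
    }

    /** Writes a primitive name, a full record definition, or a reference to a seen record. */
    static String write(Object type, Set<String> seen) {
        if (type instanceof String) {
            return "\"" + type + "\""; // primitive such as "int"
        }
        ToyRecord record = (ToyRecord) type;
        if (!seen.add(record.fullName)) {
            // Already defined earlier in this output: emit only the name.
            return "\"" + record.fullName + "\"";
        }
        List<String> fieldJson = new ArrayList<>();
        for (Map.Entry<String, Object> e : record.fields.entrySet()) {
            fieldJson.add("{\"name\":\"" + e.getKey() + "\",\"type\":"
                    + write(e.getValue(), seen) + "}");
        }
        return "{\"type\":\"record\",\"name\":\"" + record.fullName
                + "\",\"fields\":[" + String.join(",", fieldJson) + "]}";
    }

    public static void main(String[] args) {
        ToyRecord child = new ToyRecord("org.apache.avro.UnionTestChild")
                .field("intField", "int")
                .field("stringField", "string");
        ToyRecord parent = new ToyRecord("org.apache.avro.UnionTestParent")
                .field("child", child);
        // The child is defined in full once; inside the parent it appears by name only.
        System.out.println(writeUnion(Arrays.asList(child, parent)));
    }
}

/** Toy stand-in for a record schema: a full name plus ordered fields. */
final class ToyRecord {
    final String fullName;
    // Field name -> field type: a primitive name (String) or a nested ToyRecord.
    final Map<String, Object> fields = new LinkedHashMap<>();

    ToyRecord(String fullName) {
        this.fullName = fullName;
    }

    ToyRecord field(String name, Object type) {
        fields.put(name, type);
        return this;
    }
}
```

The design point is that the set of seen names is created once per top-level write and passed down, so the second occurrence of UnionTestChild is recognized no matter which union branch it sits in.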
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types
Posted by "Marcio Silva (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcio Silva updated AVRO-1204:
-------------------------------
Attachment: 0001-Creating-test-case-to-illustrate-Union-schema-proble.patch
Attached a patch containing a JUnit test that illustrates the problem.
[jira] [Commented] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types
Posted by "Marcio Silva (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499294#comment-13499294 ]
Marcio Silva commented on AVRO-1204:
------------------------------------
Sorry, I should've been more thorough in checking the unit test. I stopped when I saw the same error we had been getting, when I should have looked at the stack trace. We were seeing this bug with some of our internal schemas. I'll try to create a better test that reproduces the error we were seeing.
[jira] [Updated] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated AVRO-1204:
-------------------------------
Attachment: AVRO-1204.patch
I was able to make this test pass. I have attached my modified version.
At first it failed in setUp() while parsing the parent schema with a parser that already had the child schema defined. So I changed it to create a new parser for each file, since the parent file is a standalone schema and does not refer to the child by name.
Second, I changed the assertEquals() in assertSchemaParseable() to compare the schema against the result of printing it and parsing it again.
I think these were both bugs in the test, not bugs in Avro. So, perhaps there is a bug in Avro you're encountering, but I don't see how this test demonstrates it.
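To illustrate the first point, here is a small self-contained sketch of why reusing one parser for two standalone files trips a redefinition error. It is a toy model and not Avro's Schema.Parser: ToyParser and ParserReuseDemo are hypothetical names, and the only behavior assumed is that a parser remembers every name it has defined and rejects a second definition of the same name.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical demo: parse the two files with a shared parser, then with fresh ones. */
final class ParserReuseDemo {
    static final String CHILD = "org.apache.avro.UnionTestChild";
    static final String PARENT = "org.apache.avro.UnionTestParent";

    /** Returns true if both files parse without a redefinition error. */
    static boolean parseBoth(ToyParser forChild, ToyParser forParent) {
        try {
            // child-schema.avsc defines the child record.
            forChild.parse(CHILD);
            // parent-schema.avsc defines the parent and, inline, the child again.
            forParent.parse(PARENT, CHILD);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyParser shared = new ToyParser();
        System.out.println("shared parser: " + parseBoth(shared, shared));
        System.out.println("fresh parsers: " + parseBoth(new ToyParser(), new ToyParser()));
    }
}

/** Toy stand-in for a schema parser: it only tracks which full names are defined. */
final class ToyParser {
    private final Set<String> defined = new HashSet<>();

    /** "Parses" one file by registering every full name that file defines. */
    void parse(String... definedNames) {
        for (String name : definedNames) {
            if (!defined.add(name)) {
                throw new IllegalStateException("Can't redefine: " + name);
            }
        }
    }
}
```

With a shared parser the inline child definition in the parent file collides with the one already registered from the child file; with one parser per file each definition is seen only once.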
[jira] [Updated] (AVRO-1204) Schema.createUnion can produce schemas that aren't parseable due to redefinition of types
Posted by "Marcio Silva (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcio Silva updated AVRO-1204:
-------------------------------
Affects Version/s: 1.7.2