Posted to dev@avro.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2016/02/04 23:15:39 UTC
[jira] [Created] (AVRO-1795) Python2: Cannot parse nested schemas
Jakob Homan created AVRO-1795:
---------------------------------
Summary: Python2: Cannot parse nested schemas
Key: AVRO-1795
URL: https://issues.apache.org/jira/browse/AVRO-1795
Project: Avro
Issue Type: Bug
Components: python
Affects Versions: 1.8.0
Reporter: Jakob Homan
Assignee: Jakob Homan
In the Java client, one can parse nested schemas by loading the nested schema before the nesting schema.
For example, a header can be defined in one file:
{code:javascript}{ "namespace": "python.avro",
  "type": "record",
  "name": "header",
  "fields": [
    { "name": "header_field", "type": "string" }
  ]
}{code}
and then included in another schema:
{code:javascript}{ "namespace": "python.avro",
  "type": "record",
  "name": "event",
  "fields": [
    { "name": "header", "type": "python.avro.header" },
    { "name": "event_field", "type": "string" }
  ]
}{code}
As long as one instantiates the Parser and loads the header first, the schemas will be reconciled and merged correctly.
However, the Python client does not support this. The {{parse}} function in {{schema.py}} always instantiates a new Names object to hold the schemas, so named types registered by an earlier call are not visible to later calls:
{code}def parse(json_string):
  """Constructs the Schema from the JSON text."""
  # TODO(hammer): preserve stack trace from JSON parse
  # parse the JSON
  try:
    json_data = json.loads(json_string)
  except:
    raise SchemaParseException('Error parsing JSON: %s' % json_string)
  # Initialize the names object
  names = Names()
  # construct the Avro Schema object
  return make_avsc_object(json_data, names){code}
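The effect can be reproduced with a small toy model that does not use the Avro library at all. In this sketch, {{toy_parse}} and {{toy_make_object}} are hypothetical stand-ins for {{parse}} and {{make_avsc_object}}, and a plain dict plays the role of Names; only the fresh-registry-per-call behavior is taken from the code above.

```python
import json

PRIMITIVES = frozenset(
    ["null", "boolean", "int", "long", "float", "double", "bytes", "string"])

HEADER = """{ "namespace": "python.avro", "type": "record", "name": "header",
  "fields": [ { "name": "header_field", "type": "string" } ] }"""

EVENT = """{ "namespace": "python.avro", "type": "record", "name": "event",
  "fields": [ { "name": "header", "type": "python.avro.header" },
              { "name": "event_field", "type": "string" } ] }"""


class SchemaParseException(Exception):
    pass


def toy_make_object(json_data, names):
    # Hypothetical stand-in for make_avsc_object. It only understands
    # primitive type names, fully qualified references to known records,
    # and records that carry an explicit namespace.
    if not isinstance(json_data, dict):
        if json_data in PRIMITIVES or json_data in names:
            return json_data
        raise SchemaParseException("Unknown type: %s" % json_data)
    for field in json_data["fields"]:
        toy_make_object(field["type"], names)
    fullname = "%s.%s" % (json_data["namespace"], json_data["name"])
    names[fullname] = json_data
    return fullname


def toy_parse(json_string):
    names = {}  # a fresh registry on every call, like Names() in parse
    return toy_make_object(json.loads(json_string), names)


header_name = toy_parse(HEADER)  # succeeds: only primitive field types
try:
    toy_parse(EVENT)  # fails: the header registered above is gone
    event_error = None
except SchemaParseException as exc:
    event_error = str(exc)
```

In this model, parsing the header first does not help, because the registry that recorded it is discarded when {{toy_parse}} returns.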
Some possible fixes for this are:
1) Create a separate Parser class to mimic the Schema.Parser Java approach, while deprecating the current parse method.
2) Make Names a global variable used by the parse method, allowing multiple parse calls to populate the same namespace. This breaks current behavior (at least one unit test depends on it), so it would not be backwards compatible.
3) Create a new parse method that accepts a Names instance and returns both the schema and that instance. This keeps the code nice and functional, but it exposes the Names class, which had previously not been particularly public.
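As a rough illustration of the third option, the following toy sketch threads the registry through the call explicitly. It does not use the Avro library: {{toy_parse_with_names}} and {{toy_make_object}} are hypothetical names, a plain dict stands in for Names, and the signature is only one possible shape for such a method.

```python
import json

PRIMITIVES = frozenset(
    ["null", "boolean", "int", "long", "float", "double", "bytes", "string"])

HEADER = """{ "namespace": "python.avro", "type": "record", "name": "header",
  "fields": [ { "name": "header_field", "type": "string" } ] }"""

EVENT = """{ "namespace": "python.avro", "type": "record", "name": "event",
  "fields": [ { "name": "header", "type": "python.avro.header" },
              { "name": "event_field", "type": "string" } ] }"""


def toy_make_object(json_data, names):
    # Hypothetical stand-in for make_avsc_object: handles primitives,
    # references to known records, and records with an explicit namespace.
    if not isinstance(json_data, dict):
        if json_data in PRIMITIVES or json_data in names:
            return json_data
        raise ValueError("Unknown type: %s" % json_data)
    for field in json_data["fields"]:
        toy_make_object(field["type"], names)
    fullname = "%s.%s" % (json_data["namespace"], json_data["name"])
    names[fullname] = json_data
    return fullname


def toy_parse_with_names(json_string, names=None):
    # Copy the incoming registry so the caller's dict is never mutated;
    # the updated registry is handed back alongside the schema.
    names = dict(names or {})
    schema = toy_make_object(json.loads(json_string), names)
    return schema, names


header, header_names = toy_parse_with_names(HEADER)
event, all_names = toy_parse_with_names(EVENT, header_names)
```

Because the registry is copied and returned, each call stays free of shared state, at the cost of making the caller carry the registry around.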
I like the first approach.
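A minimal sketch of what the first approach could look like, written as a toy model and not against the Avro code base. {{ToyParser}} is a hypothetical name, a plain dict stands in for Names, and the type handling covers only primitives, name references, and records with an explicit namespace.

```python
import json

PRIMITIVES = frozenset(
    ["null", "boolean", "int", "long", "float", "double", "bytes", "string"])

HEADER = """{ "namespace": "python.avro", "type": "record", "name": "header",
  "fields": [ { "name": "header_field", "type": "string" } ] }"""

EVENT = """{ "namespace": "python.avro", "type": "record", "name": "event",
  "fields": [ { "name": "header", "type": "python.avro.header" },
              { "name": "event_field", "type": "string" } ] }"""


class ToyParser(object):
    """Hypothetical analogue of a parser object that is reused across calls."""

    def __init__(self):
        # The registry lives on the instance, so it survives between calls.
        self.names = {}

    def parse(self, json_string):
        return self._make_object(json.loads(json_string))

    def _make_object(self, json_data):
        if not isinstance(json_data, dict):
            if json_data in PRIMITIVES or json_data in self.names:
                return json_data
            raise ValueError("Unknown type: %s" % json_data)
        for field in json_data["fields"]:
            self._make_object(field["type"])
        fullname = "%s.%s" % (json_data["namespace"], json_data["name"])
        self.names[fullname] = json_data
        return fullname


parser = ToyParser()
header = parser.parse(HEADER)  # registers python.avro.header
event = parser.parse(EVENT)    # resolves the header reference
```

Callers who want the current isolated behavior would simply create a new parser per schema, which is why this shape can coexist with a deprecated module-level {{parse}}.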