Posted to user@avro.apache.org by Saptarshi Guha <sg...@mozilla.com> on 2012/06/25 08:17:12 UTC
C/C++ parsing vs. Java parsing.
I have an Avro schema, found here: http://sguha.pastebin.mozilla.org/1677671
I tried
java -jar avro-tools-1.7.0.jar compile schema ~/tmp/robject.avro foo
and it worked.
This failed:
avrogencpp --input ~/tmp/robject.avro --output ~/tmp/h2
Segmentation fault: 11
This failed:
avro_schema_t *person_schema = (avro_schema_t*)malloc(sizeof(avro_schema_t));
(avro_schema_from_json_literal(string.of.avro.file), person_schema)
with
Error was Error parsing JSON: string or '}' expected near end of file
Q1: Do the C and C++ APIs support all schemas that the Java one supports?
Q2: If the answer to Q1 is yes, is this a bug?
Regards
Saptarshi
Re: C/C++ parsing vs. Java parsing.
Posted by Douglas Creager <do...@creagertino.net>.
> 3. C
>
> avro_schema_t *person_schema = (avro_schema_t*)malloc(sizeof(avro_schema_t));
> (avro_schema_from_json_literal(jsonstring, person_schema))
>
> returns:
>
> Error was Error parsing JSON: string or '}' expected near end of file
>
> So is this a bug? or am i calling it wrong.
That error message is from the JSON parser we use internally — it claims that there's a syntax error in the JSON that you've passed in. Can you send us the snippet where you define jsonstring? It might be an issue of escaping things correctly in the C string literal. Also, there's a comment where avro_schema_from_json_literal is defined, saying that jsonstring must be defined as a "char[]" and not a "char *". And of course it could also be an actual syntax error. :-)
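A minimal sketch of why the char[] versus char * distinction can matter, assuming the helper measures its argument with sizeof (the header comment mentioned above suggests this, but treat it as an assumption). With a pointer, sizeof yields the pointer size rather than the string length, so a parser given that length would see only the first few bytes of the schema and could report an unexpected end of file. LITERAL_LENGTH, schema_array, and schema_pointer are hypothetical names for illustration, not part of the Avro C API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a sizeof-based length macro (not the Avro macro). */
#define LITERAL_LENGTH(s) (sizeof(s) - 1)

/* Declared as an array: sizeof sees the whole literal, including the NUL. */
static const char schema_array[] =
    "{\"type\":\"record\",\"name\":\"person\",\"fields\":[]}";

/* Declared as a pointer: sizeof sees only the pointer itself. */
static const char *schema_pointer = schema_array;

/* Length computed from the array: matches strlen(schema_array). */
static size_t length_from_array(void)
{
    return LITERAL_LENGTH(schema_array);
}

/* Length computed from the pointer: sizeof(char *) - 1, e.g. 7 on a 64-bit
 * platform, no matter how long the schema text is. */
static size_t length_from_pointer(void)
{
    return LITERAL_LENGTH(schema_pointer);
}
```

If this is the cause, declaring the schema as `const char jsonstring[] = "...";` should give the helper the full length. Separately, the quickstop example shipped with Avro C appears to declare a plain `avro_schema_t person_schema;` and pass `&person_schema`, with no malloc.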
–doug
Re: C/C++ parsing vs. Java parsing.
Posted by Saptarshi Guha <sg...@mozilla.com>.
I should mention:
a) I need Java and C, because the messages will be consumed by both Java and C.
b) I'd rather stay away from C++ because of the Boost dependency. Nothing against it; it just becomes another installation hurdle.
c) I need to check with other languages, e.g. Python, since I am looking forward to language interop.
Thanks again
Saptarshi
Re: C/C++ parsing vs. Java parsing.
Posted by Saptarshi Guha <sg...@mozilla.com>.
Hi Scott,
Thanks for the response. I changed the avro file to [1]
1. Java works.
2. avrogencpp
avrogencpp -i ~/tmp/robject.avro -o foo
works.
3. C
avro_schema_t *person_schema = (avro_schema_t*)malloc(sizeof(avro_schema_t));
(avro_schema_from_json_literal(jsonstring, person_schema))
returns:
Error was Error parsing JSON: string or '}' expected near end of file
So is this a bug, or am I calling it wrong?
Ideally, I would like a union of
["NULL","RAW","INTEGER","REAL","COMPLEX","LOGICAL","STRING","LIST"]
Each of these is a record with (1) a typed value (possibly an array of integers, though COMPLEX is an array of records)
and (2) another field called Attributes.
e.g.
[
{"type":"record",
"name":"REAL",
"fields":[
{"name":"whattype", "type":"myrtype"},
{"name":"value", "type":"array" , "items":"double"},
{"name":"attrs" , "type":"attrytpe"}
]
},
{"type":"record",
"name":"INTEGER",
"fields":[
{"name":"whattype", "type":"myrtype"},
{"name":"value", "type":"array" , "items":"integers"},
{"name":"attrs" , "type":"attrytpe"}
]
}
,...
]
Here 'attrytpe' is a map type defined elsewhere and "myrtype" is an enum defined elsewhere.
Similarly, for the complex one in the union, would its 'value' field be an array of a "complex" type defined elsewhere?
Would I need multiple Avro files using the same namespace?
Or is this the serialized equivalent of what I had before, in [1]?
Thanks for your time
Saptarshi
[1]
{
"namespace": "robjects.avro",
"type": "record",
"name": "robject",
"doc" : "Encoding of some of the R data types",
"fields": [
{"name":"typeof" ,"type":{"type":"enum", "name":"thetype" ,"symbols": ["NULL","RAW","INTEGER","REAL","COMPLEX","LOGICAL","STRING","LIST","ATTRIBUTES"]}},
{"name":"NAtype" ,"type":{"type":"enum" , "name":"NA" ,"symbols":["NA"]}},
{"name":"complextype","type":{"type":"record" , "name":"complex", "fields":[
{"name":"re", "type":"double"},
{"name":"im", "type":"double"}
]}},
{"name":"NULL" ,"type":"null"},
{"name":"RAW" ,"type":["null",{"type":"array" ,"items":"bytes"}]},
{"name":"INTEGER" ,"type":["null",{"type":"array" ,"items":"int"}]},
{"name":"REAL" ,"type":["null",{"type":"array" ,"items":"double"}]},
{"name":"COMPLEX" ,"type":["null",{"type":"array" ,"items":"complex"}]},
{"name":"LOGICAL" ,"type":["null",{"type":"array" ,"items":["boolean","NA"]}]},
{"name":"STRING" ,"type":["null",{"type":"array" ,"items":["string","NA"]}]},
{"name":"LIST" ,"type":["null",{"type":"array" ,"items":["robject"]}]},
{"name":"ATTRIBUTES" ,"type":["null",{"type":"map" ,"values":"robject"}]}
]
}
Re: C/C++ parsing vs. Java parsing.
Posted by Scott Carey <sc...@apache.org>.
The schema provided is a union of several schemas. Java supports parsing
this, C++ may not. Does it work if you make it one single schema, and
nest "NA", "acomplex" and "retypes" inside of "object" ? It only needs to
be defined the first time it is referenced. If it does not, then it is
certainly a bug.
Either way I would file a bug in JIRA. The spec does not say whether a
file should be parseable if it contains a union rather than a record, but
it probably should be.
-Scott
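A hedged sketch of the single-file layout described above, written as a C char[] literal with the inner quotes escaped. Each named type (here the enum myrtype) is defined in full the first time it appears and is referenced by name afterwards. Maps and arrays are not named types in Avro, so the attributes map is spelled out inside each record, and an array-valued field nests its definition under "type". The symbol list, the map's value type, and the helper brackets_balanced are assumptions for illustration; the helper only counts brackets outside strings and is not a schema validator.

```c
#include <stddef.h>

/* Hypothetical top-level union of two records sharing one enum definition. */
static const char UNION_SCHEMA[] =
    "["
    "{\"type\":\"record\",\"name\":\"REAL\",\"namespace\":\"robjects.avro\",\"fields\":["
    "{\"name\":\"whattype\",\"type\":{\"type\":\"enum\",\"name\":\"myrtype\","
        "\"symbols\":[\"INTEGER\",\"REAL\"]}},"          /* defined at first use */
    "{\"name\":\"value\",\"type\":{\"type\":\"array\",\"items\":\"double\"}},"
    "{\"name\":\"attrs\",\"type\":{\"type\":\"map\",\"values\":\"string\"}}"
    "]},"
    "{\"type\":\"record\",\"name\":\"INTEGER\",\"namespace\":\"robjects.avro\",\"fields\":["
    "{\"name\":\"whattype\",\"type\":\"myrtype\"},"      /* referenced by name */
    "{\"name\":\"value\",\"type\":{\"type\":\"array\",\"items\":\"int\"}},"
    "{\"name\":\"attrs\",\"type\":{\"type\":\"map\",\"values\":\"string\"}}"
    "]}"
    "]";

/* Sanity check: returns 1 if braces and brackets outside JSON strings are
 * balanced within the first len bytes and no string is left open. */
static int brackets_balanced(const char *json, size_t len)
{
    int objects = 0, arrays = 0, in_string = 0;
    for (size_t i = 0; i < len; i++) {
        char c = json[i];
        if (in_string) {
            if (c == '\\' && i + 1 < len)
                i++;                    /* skip the escaped character */
            else if (c == '"')
                in_string = 0;
            continue;
        }
        switch (c) {
        case '"': in_string = 1; break;
        case '{': objects++; break;
        case '}': if (--objects < 0) return 0; break;
        case '[': arrays++; break;
        case ']': if (--arrays < 0) return 0; break;
        default: break;
        }
    }
    return !in_string && objects == 0 && arrays == 0;
}
```

Calling brackets_balanced(UNION_SCHEMA, sizeof(UNION_SCHEMA) - 1) checks the whole literal, while passing a short length such as 7 mimics a truncated schema. Whether the C library accepts a top-level union at all is the open question in this thread, so this layout is not a confirmed fix.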