Posted to dev@avro.apache.org by "Thiruvalluvan M. G. (JIRA)" <ji...@apache.org> on 2017/10/11 01:20:00 UTC
[jira] [Commented] (AVRO-2095) Avro 1.8.2 encode in c++ -
java.lang.ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/AVRO-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199665#comment-16199665 ]
Thiruvalluvan M. G. commented on AVRO-2095:
-------------------------------------------
There are a couple of basic errors:
* The top-level schema you have defined is a union schema: a union of {{Event}} and {{MyDevice}}. If you want to produce a file of events, you should define {{Event}} as the top-level schema. Here is a way to do it:
{code}
{
"namespace": "com.test",
"name": "Event",
"type": "record",
"doc": "event",
"fields": [
{
"name": "myDevice",
"type": [
"null",
{
"namespace": "com.test",
"name": "MyDevice",
"type": "record",
"doc": "client device",
"fields": [
{
"name": "deviceId",
"type": [
"null",
"string"
],
"default": null,
"doc": "Usually unique MAC address"
}
]
}
],
"default": null,
"doc": "Device information"
}
]
}
{code}
* Having done that, you need to create an Avro Data File in order to read it with the avro-tools jar. What you have done is simply write the binary-encoded Avro datum into memory. Here is an example of writing a Data File:
{code}
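#include <string>
#include "tst.h"        // header generated by avrogencpp from the schema
#include "DataFile.hh"  // avro::DataFileWriter
#include "Compiler.hh"  // avro::compileJsonSchemaFromString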
int main()
{
std::string schemaStr = std::string(
" {"
" \"namespace\": \"com.test\","
" \"name\": \"Event\","
" \"type\": \"record\","
" \"doc\": \"event\","
" \"fields\": ["
" {"
" \"name\": \"myDevice\","
" \"type\": ["
" \"null\","
" {"
" \"namespace\": \"com.test\","
" \"name\": \"MyDevice\","
" \"type\": \"record\","
" \"doc\": \"client device\","
" \"fields\": ["
" {"
" \"name\": \"deviceId\","
" \"type\": ["
" \"null\","
" \"string\""
" ],"
" \"default\": null,"
" \"doc\": \"Usually unique MAC address\""
" }"
" ]"
" }"
" ],"
" \"default\": null,"
" \"doc\": \"Device information\""
" }"
" ]"
" }"
);
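// Compile the JSON schema text into a validated schema object.
avro::ValidSchema schema = avro::compileJsonSchemaFromString(schemaStr);
// Open an Avro Data File writer: 16 KiB sync interval, deflate compression.
avro::DataFileWriter<Event> w("a.avro", schema, 16 * 1024, avro::DEFLATE_CODEC);
MyDevice device;
device.deviceId.set_string("device1");
Event event;
event.myDevice.set_MyDevice(device);
w.write(event);
// Flush buffered records and finish the file.
w.close();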
}
{code}
The code above assumes that your generated header file is {{tst.h}}. It creates a file called {{a.avro}}. Now, if you try to read the file using the avro-tools jar, you should be able to see your record.
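As a side note on the first point, the byte layout suggests why the Java reader reports index 7. The sketch below uses only the C++ standard library, not the Avro library, and the helper names ({{writeLong}}, {{readLong}}, {{encodeEvent}}, {{indexesReadAsTopLevelUnion}}) are made up for illustration. It assumes the C++ side wrote a bare {{Event}} and that the Java side, given the top-level union schema, reads one zigzag varint branch index for the top-level union, then for {{Event.myDevice}}, then for {{MyDevice.deviceId}}. That assumption is consistent with the three nested union reads in the stack trace, but it is a reconstruction, not something taken from the Avro source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Avro's binary encoding writes longs (including union branch indexes and
// string lengths) as zigzag-encoded variable-length integers.
void writeLong(std::vector<uint8_t>& out, int64_t v) {
    uint64_t z = (static_cast<uint64_t>(v) << 1) ^ static_cast<uint64_t>(v >> 63);
    while (z & ~0x7FULL) {
        out.push_back(static_cast<uint8_t>((z & 0x7F) | 0x80));
        z >>= 7;
    }
    out.push_back(static_cast<uint8_t>(z));
}

// Reads one zigzag varint starting at pos and advances pos past it.
int64_t readLong(const std::vector<uint8_t>& in, std::size_t& pos) {
    uint64_t z = 0;
    int shift = 0;
    while (true) {
        if (pos >= in.size()) throw std::out_of_range("unexpected end of data");
        assert(shift < 64);  // a valid 64-bit varint never needs more bits
        uint8_t b = in[pos++];
        z |= static_cast<uint64_t>(b & 0x7F) << shift;
        if (!(b & 0x80)) break;
        shift += 7;
    }
    return static_cast<int64_t>(z >> 1) ^ -static_cast<int64_t>(z & 1);
}

// Hypothetical helper: the bytes of one Event whose writer schema is the
// Event record itself (no top-level union index is written).
std::vector<uint8_t> encodeEvent(const std::string& deviceId) {
    std::vector<uint8_t> out;
    writeLong(out, 1);  // Event.myDevice: branch 1 of ["null", MyDevice]
    writeLong(out, 1);  // MyDevice.deviceId: branch 1 of ["null", "string"]
    writeLong(out, static_cast<int64_t>(deviceId.size()));  // string length
    out.insert(out.end(), deviceId.begin(), deviceId.end());
    return out;
}

// Hypothetical helper: the three branch indexes a reader would consume if it
// treated the same bytes as the top-level union [MyDevice, Event].
std::vector<int64_t> indexesReadAsTopLevelUnion(const std::vector<uint8_t>& bytes) {
    std::size_t pos = 0;
    std::vector<int64_t> idx;
    idx.push_back(readLong(bytes, pos));  // taken as the top-level union branch
    idx.push_back(readLong(bytes, pos));  // taken as the Event.myDevice branch
    idx.push_back(readLong(bytes, pos));  // taken as the MyDevice.deviceId branch
    return idx;
}
```

For {{encodeEvent("device1")}} the bytes begin 02 02 0E, so the three indexes come out as 1, 1 and 7. The third value is really the string length, and 7 is out of range for the two-branch union of null and string, which would match the reported {{ArrayIndexOutOfBoundsException: 7}}.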
> Avro 1.8.2 encode in c++ - java.lang.ArrayIndexOutOfBoundsException
> --------------------------------------------------------------------
>
> Key: AVRO-2095
> URL: https://issues.apache.org/jira/browse/AVRO-2095
> Project: Avro
> Issue Type: New Feature
> Components: c++, java
> Affects Versions: 1.8.2
> Environment: C++, Java
> Reporter: Karthik
> Labels: newbie
>
> I have the following schema
> {code:json}
> [
> {
> "namespace": "com.test",
> "name": "MyDevice",
> "type": "record",
> "doc": "client device",
> "fields": [
> {
> "name": "deviceId",
> "type": [
> "null",
> "string"
> ],
> "default": null,
> "doc": "Usually unique MAC address"
> }
> ]
> },
> {
> "namespace": "com.test",
> "name": "Event",
> "type": "record",
> "doc": "event",
> "fields": [
> {
> "name": "myDevice",
> "type": [
> "null",
> "com.test.MyDevice"
> ],
> "default": null,
> "doc": "Device information"
> }
> ]
> }
> ]
> {code}
> I installed avro 1.8.2 on my ubuntu build machine and generated test.h using avrogencpp tool.
> Then, I created binary encoded avro data as follows:
> {code:c++}
> MyDevice device;
> device.deviceId.set_string("device1");
> Event event;
> event.myDevice.set_MyDevice(device);
> std::vector<char> bytes;
> std::auto_ptr<avro::OutputStream> out = avro::memoryOutputStream(1);
> avro::EncoderPtr e = avro::binaryEncoder();
> e->init(*out);
> avro::encode(*e, event);
> out->flush();
> {code}
> I deserialize my data in Java application as follows:
> {code:java}
> Schema schema = SchemaUtils.getSchemaFromFile("src/main/resources/schemas/test.avsc");
> DatumReader<GenericRecord> genericDatumReader = new GenericDatumReader<>(schema);
> Decoder decoder = DecoderFactory.get().binaryDecoder(data, null);
> try {
> GenericRecord userData = genericDatumReader.read(null, decoder);
> System.out.println(userData);
> } catch (IOException e) {
> e.printStackTrace();
> }
> {code}
> And the result is
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 7
> at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:424)
> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
> {noformat}
> But if I do the same using a simple schema (without union), it works perfectly
> {code:json}
> {
> "namespace": "com.test",
> "name": "MyDevice",
> "type": "record",
> "doc": "client device",
> "fields": [
> {
> "name": "deviceId",
> "type": [
> "null",
> "string"
> ],
> "default": null,
> "doc": "Usually unique MAC address"
> }
> ]
> }
> {code}
> Any help appreciated ! Thanks !