Posted to user@avro.apache.org by Erwin Speybroeck <er...@crv4all.com> on 2020/03/31 06:51:41 UTC
AVRO definition question - record within a record?
Hi,
I need to be able to make a POST call to an API and the body should look like this :
{
  "location" : "355669",
  "countryCode" : "NL",
  "identificationNumber" : "NL 672760327",
  "externalId" : "KTSPRED_01_817997491",
  "dateTime" : "2019-11-08T04:33:41.000Z",
  "value" : "GEMIDDELD_RISICO",
  "type" : "ketosis_prediction",
  "additionalInformation" : "{
    "calvingDate": "2018-10-01",
    "parity": "3",
    "create_date": "2019-11-08T04:33:41.000Z "
  }"
}
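A note on the sample body: as written, the value of "additionalInformation" opens with a double quote and then contains unescaped quotes, so the sample is not valid JSON. Two well-formed readings are possible: a nested JSON object, or a string holding escaped JSON text. Which one the API expects is not stated in this message. The sketch below only illustrates the difference between the two; the class name AdditionalInfoForms and its helper methods are hypothetical, and the escaping is deliberately minimal (backslashes and double quotes only).

```java
class AdditionalInfoForms {
    // Wrap a value in double quotes, escaping backslashes and double quotes.
    // Control characters are not handled; this is a sketch, not a JSON library.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Inner JSON object built from the three values shown in the sample body.
    static String inner(String calvingDate, String parity, String createDate) {
        return "{" + quote("calvingDate") + ":" + quote(calvingDate) + ","
                + quote("parity") + ":" + quote(parity) + ","
                + quote("create_date") + ":" + quote(createDate) + "}";
    }

    // Reading 1: additionalInformation is a nested JSON object.
    static String asObject(String inner) {
        return "{" + quote("additionalInformation") + ":" + inner + "}";
    }

    // Reading 2: additionalInformation is a string containing escaped JSON text.
    static String asString(String inner) {
        return "{" + quote("additionalInformation") + ":" + quote(inner) + "}";
    }

    public static void main(String[] args) {
        String inner = inner("2018-10-01", "3", "2019-11-08T04:33:41.000Z");
        System.out.println(asObject(inner));
        System.out.println(asString(inner));
    }
}
```

A nested Avro record corresponds to the first reading. If the API really wants the second, the field would stay a string in the schema and the inner JSON would have to be built and escaped before it is stored in that field.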
I tried the following AVRO definition for serialisation (starting from a csv file) :
{
  "type" : "record",
  "name" : "person",
  "namespace" : "nifi",
  "fields" : [
    {"name" : "location", "type" : "int"},
    {"name" : "country", "type" : "string"},
    {"name" : "animal_number", "type" : "string"},
    {"name" : "alert_id", "type" : "string"},
    {"name" : "alert_date", "type" : "string"},
    {"name" : "type_of_alert", "type" : "string"},
    {"name" : "alert_name", "type" : "string"},
    {"name" : "additionalInformation",
     "type" : {
       "type" : "record",
       "name" : "test",
       "fields" : [
         {"name" : "calving_date", "type" : "string"},
         {"name" : "parity", "type" : "string"},
         {"name" : "create_dtm_dl", "type" : "string"}
       ]},
     "default" : {}
    }
  ]
}
But it does not work. Is it possible to define a new record within a record? Or should it be done in another way?
My Hive tables are in CSV and I have to convert them to JSON so I can post them.
To create this JSON I have to use an Avro schema. It works fine until the field "additionalInformation".
I am not able to generate the fields inside additionalInformation. The only thing I can do is declare additionalInformation as a string, but then the fields I want are not created and the POST fails.
Above is my Avro schema for creating the JSON. The part that defines the additionalInformation field as a record (bold in the original message) does not work. If I change its type to string it works, but then the POST body is not JSON.
The csv file looks like this - maybe I need to change this input file in some way?
alert_name;animal_number;country;location;alert_id;type_of_alert;alert_date;calving_date;parity;create_dtm_dl
"ketosis_prediction";"NL 743169121";"NL";83618;"KTSPRED_01_817997482";"HOOG_RISICO";"2019-11-08 04:33:38.0";2019-11-07 00:00:00.0;4;2019-11-09 19:13:29.484
"ketosis_prediction";"NL 672760327";"NL";355669;"KTSPRED_01_817997491";"GEMIDDELD_RISICO";"2019-11-08 04:33:41.0";2019-11-07 00:00:00.0;3;2019-11-09 19:13:29.484
Kind regards,
Erwin Speybroeck
Lead Business Consultant | BU Data
(0)26-3898621
0032475-252401
erwin.speybroeck@crv4all.com
Re: AVRO definition question - record within a record?
Posted by fa...@legsem.com.
This code, using your schema:
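import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

// The class name is arbitrary; the original snippet was a bare fragment.
public class NestedRecordExample {
  public static void main(String[] args) throws Exception {
    // Parse the schema from the question, saved as an .avsc file.
    Schema schema = new Schema.Parser().parse(new File("src/test/data/nestedrecord.avsc"));
    // Pretty-printing JSON encoder that writes to standard output.
    JsonEncoder out = EncoderFactory.get().jsonEncoder(schema, System.out, true);
    DatumWriter<Object> writer = new GenericDatumWriter<>(schema);
    // Outer "person" record.
    GenericRecord person = new GenericData.Record(schema);
    person.put("location", 5);
    person.put("country", "TH");
    person.put("animal_number", "7");
    person.put("alert_id", "ab1");
    person.put("alert_date", "2014-12-05");
    person.put("type_of_alert", "tu");
    person.put("alert_name", "zu");
    // Nested record, built from the schema of the "additionalInformation" field.
    GenericRecord test = new GenericData.Record(schema.getField("additionalInformation").schema());
    test.put("calving_date", "2014-12-05");
    test.put("parity", "p");
    test.put("create_dtm_dl", "12:12:12");
    person.put("additionalInformation", test);
    // Serialize the outer record, including the nested one, as JSON.
    writer.write(person, out);
    out.flush();
  }
}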
Produces this result:
{
  "location" : 5,
  "country" : "TH",
  "animal_number" : "7",
  "alert_id" : "ab1",
  "alert_date" : "2014-12-05",
  "type_of_alert" : "tu",
  "alert_name" : "zu",
  "additionalInformation" : {
    "calving_date" : "2014-12-05",
    "parity" : "p",
    "create_dtm_dl" : "12:12:12"
  }
}
So nested records are properly supported in Avro and are widely used.
Maybe something is wrong in the code you are using?
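One difference worth checking: the CSV in the question is flat, with calving_date, parity and create_dtm_dl sitting next to the other columns, whereas the schema expects them inside the nested "additionalInformation" record. Some step therefore has to regroup those three columns. How the conversion tool in use handles this is not shown in this thread, so the following is only a plain-Java sketch of the regrouping itself. The class name CsvRowToNestedJson and its methods are hypothetical, and it assumes simple fields with no embedded semicolons or escaped quotes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CsvRowToNestedJson {
    // Columns that belong inside the nested record, per the schema in the question.
    static final List<String> NESTED =
            Arrays.asList("calving_date", "parity", "create_dtm_dl");

    // Split one semicolon-delimited line and drop surrounding double quotes.
    // Assumption: no semicolons or escaped quotes inside a field.
    static List<String> split(String line) {
        List<String> out = new ArrayList<>();
        for (String cell : line.split(";", -1)) {
            String c = cell.trim();
            if (c.length() >= 2 && c.startsWith("\"") && c.endsWith("\"")) {
                c = c.substring(1, c.length() - 1);
            }
            out.add(c);
        }
        return out;
    }

    // Pair the header names with the values of one data row, keeping column order.
    static Map<String, String> toFlat(String header, String row) {
        List<String> names = split(header);
        List<String> values = split(row);
        if (names.size() != values.size()) {
            throw new IllegalArgumentException("header and row differ in length");
        }
        Map<String, String> flat = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            flat.put(names.get(i), values.get(i));
        }
        return flat;
    }

    // Move the NESTED columns under "additionalInformation"; "location" becomes
    // an int to match the schema, everything else stays a top-level string.
    static Map<String, Object> nest(Map<String, String> flat) {
        Map<String, Object> top = new LinkedHashMap<>();
        Map<String, Object> info = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : flat.entrySet()) {
            if (NESTED.contains(e.getKey())) {
                info.put(e.getKey(), e.getValue());
            } else if ("location".equals(e.getKey())) {
                top.put(e.getKey(), Integer.parseInt(e.getValue()));
            } else {
                top.put(e.getKey(), e.getValue());
            }
        }
        top.put("additionalInformation", info);
        return top;
    }

    // Minimal JSON rendering for String, Number and nested Map values.
    static String toJson(Object v) {
        if (v instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) v).entrySet()) {
                if (!first) sb.append(",");
                first = false;
                sb.append(toJson(String.valueOf(e.getKey())))
                  .append(":").append(toJson(e.getValue()));
            }
            return sb.append("}").toString();
        }
        if (v instanceof Number) return v.toString();
        return "\"" + String.valueOf(v).replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String header = "alert_name;location;parity";
        String row = "\"ketosis_prediction\";355669;3";
        System.out.println(toJson(nest(toFlat(header, row))));
    }
}
```

The sketch keeps the column names from the CSV and the schema. Renaming them to the names in the target POST body (countryCode, calvingDate, and so on) and reformatting the timestamps would be separate steps that are not shown here.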
Cheers