Posted to dev@avro.apache.org by "Martin Jubelgas (JIRA)" <ji...@apache.org> on 2018/12/21 18:13:00 UTC

[jira] [Commented] (AVRO-2160) Json to Avro with non required value and union schema failing

    [ https://issues.apache.org/jira/browse/AVRO-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16726953#comment-16726953 ] 

Martin Jubelgas commented on AVRO-2160:
---------------------------------------

Hi, Lydie,

while I cannot quite make sense of your input data "str str4", I assume that the problem you describe is one I've seen other people run into. When instantiating a JSON decoder, you need to supply the schema that the data was WRITTEN with, not the schema you want to read the data with. The default value ("null") is only used when the field does not exist in the writer's schema but does exist in the reader's. If your input data does not contain the field "lastname", then the writer schema needs to reflect that. In Avro there is no such thing as a "not required" field. There are fields whose type is a union {"null", something}, but those need to be specified when using the GenericDatumReader, if I am not mistaken.
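
Roughly like this (just a sketch, not code from the ticket; the schema strings, variable names and inline JSON here are made up to mirror the example in the report):

{code:java}
// Sketch only (not from the ticket): schema strings and variable names are made up.
// imports: org.apache.avro.Schema, org.apache.avro.generic.*, org.apache.avro.io.*
String writerSchemaStr = "{\"type\":\"record\",\"namespace\":\"foo\",\"name\":\"Person\",\"fields\":["
    + "{\"name\":\"firstname\",\"type\":\"string\"},"
    + "{\"name\":\"age\",\"type\":[\"null\",\"int\"],\"default\":null}]}";
String readerSchemaStr = "{\"type\":\"record\",\"namespace\":\"foo\",\"name\":\"Person\",\"fields\":["
    + "{\"name\":\"lastname\",\"type\":[\"null\",\"string\"],\"default\":null},"
    + "{\"name\":\"firstname\",\"type\":\"string\"},"
    + "{\"name\":\"age\",\"type\":[\"null\",\"int\"],\"default\":null}]}";

Schema writerSchema = new Schema.Parser().parse(writerSchemaStr);
Schema readerSchema = new Schema.Parser().parse(readerSchemaStr);

// the decoder gets the schema the JSON was WRITTEN with (no "lastname")...
Decoder decoder = DecoderFactory.get()
    .jsonDecoder(writerSchema, "{\"firstname\":\"John\",\"age\":{\"int\":35}}");
// ...and schema resolution against the reader schema fills in the default
DatumReader<GenericRecord> reader =
    new GenericDatumReader<GenericRecord>(writerSchema, readerSchema);
GenericRecord person = reader.read(null, decoder); // person.get("lastname") == null
{code}

With the single-schema setup in your code below, the decoder insists on seeing every field of that schema in the JSON, which is exactly the error you report.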

If you want to read JSON data with "non-required fields", you will need to write your own reader (though that's not too hard) that forgoes schema evolution but can be more flexible in handling "non-required"/defaultable fields.
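
As a rough sketch of that idea (again not the ticket's code: the helper name jsonToRecord is made up, and it leans on Jackson plus Avro's GenericRecordBuilder, which applies schema defaults for fields that were never set):

{code:java}
// Sketch only: maps "plain" JSON (no Avro union wrapping) onto a record and
// falls back to the schema defaults for missing fields.
// imports: java.io.IOException, com.fasterxml.jackson.databind.*,
//          org.apache.avro.Schema, org.apache.avro.generic.*
public static GenericRecord jsonToRecord(String json, Schema schema) throws IOException {
    JsonNode node = new ObjectMapper().readTree(json);
    GenericRecordBuilder builder = new GenericRecordBuilder(schema);
    for (Schema.Field field : schema.getFields()) {
        JsonNode value = node.get(field.name());
        if (value == null || value.isNull()) {
            continue; // not present -> GenericRecordBuilder applies the field's default
        }
        // naive mapping; a real reader would switch on the (union) field type
        if (value.isTextual()) {
            builder.set(field, value.asText());
        } else if (value.isInt()) {
            builder.set(field, value.asInt());
        }
    }
    return builder.build(); // throws if a field is missing and has no default
}
{code}

The trade-off is that this bypasses Avro's JSON encoding rules (no {"int": 35}-style union wrapping) and the writer/reader schema resolution.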

That said, I'd say the behaviour you are seeing is not a bug; therefore, I'd suggest closing this ticket.

Regards,

Martin

> Json to Avro with non required value and union schema failing
> -------------------------------------------------------------
>
>                 Key: AVRO-2160
>                 URL: https://issues.apache.org/jira/browse/AVRO-2160
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.8.2
>            Reporter: Lydie
>            Priority: Critical
>              Labels: java
>
> I am trying to convert this string:
> str str4
> using this schema:
> {"type":"record", "namespace":"foo","name":"Person","fields":[\\{"name":"lastname","type": ["null","string"], "default":null}
> ,\{"name":"firstname","type":"string"},{"name":"age","type":["null","int"], "default":null}]}
> I get this error:
> com.syapse.messagePublisher.publisher.AvroEncodeException: Expected field name not found: lastnamein{"firstname":"John","age":{"int":35}}
>   at com.syapse.messagePublisher.publisher.AvroEncoder.convertJsonToAvro(AvroEncoder.java:78)
>  
> Although this should be the correct syntax for a non-required field.
> Note that it works for 
> {"lastname":\\{"string" : "Doe"}
> ,"firstname":"John","age":\{"int":36}}
>  
> What am I missing (using Avro 1.8.2)?
> here is my code:
>  
> {code:java}
> public static byte[] convertJsonToAvro(byte[] data, String schemaStr) throws AvroEncodeException {
>     InputStream input = null;
>     DataFileWriter<GenericRecord> writer = null;
>     ByteArrayOutputStream output = null;
>     try {
>         Schema schema = new Schema.Parser().parse(schemaStr);
>         // the same schema is used for the JSON decoder, the reader and the output file
>         DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema);
>         input = new ByteArrayInputStream(data);
>         DataInputStream din = new DataInputStream(input);
>         output = new ByteArrayOutputStream();
>         writer = new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>());
>         writer.create(schema, output);
>         Decoder decoder = DecoderFactory.get().jsonDecoder(schema, din);
>         GenericRecord datum = null;
>         while (true) {
>             try {
>                 datum = reader.read(null, decoder);
>             } catch (EOFException eofe) {
>                 break;
>             }
>             writer.append(datum);
>         }
>         writer.flush();
>         writer.close();
>         return output.toByteArray();
>     } catch (AvroTypeException e) {
>         throw new AvroEncodeException(e.getMessage() + "in" + new String(data));
>     } catch (IOException e1) {
>         throw new AvroEncodeException("Error decoding Json " + e1.getMessage());
>     } finally {
>         try {
>             input.close();
>         } catch (Exception e) {
>             // ignore close failure
>         }
>     }
> }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)