Posted to user@avro.apache.org by Mr rty ff <ya...@yahoo.com> on 2016/04/18 16:36:59 UTC

Deserializing with changing schema.

Hello,

Based on an Avro schema I generated a class (Data) that matches the schema.
I then encode the data and send it to another application, "A", using Kafka:

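import java.io.ByteArrayOutputStream;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumWriter;

Data data; // declared here only for the example; the object was initialized earlier

ByteArrayOutputStream out = new ByteArrayOutputStream();
// directBinaryEncoder does not buffer its output, so the bytes can be read right after write()
BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(out, null);

// The writer is typed on the generated class
DatumWriter<Data> writer = new SpecificDatumWriter<Data>(Data.class);
writer.write(data, encoder);
byte[] avroByteMessage = out.toByteArray();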


On the other side (in application "A") I deserialize the data by implementing Kafka's Deserializer:
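import java.util.Map;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Deserializer;

class DataDeserializer implements Deserializer<Data> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // nothing to do
    }

    @Override
    public Data deserialize(String topic, byte[] data) {
        if (data == null) {
            return null;
        }
        try {
            // Reader bound to the generated class, so it uses the schema compiled into Data
            DatumReader<Data> reader = new SpecificDatumReader<Data>(Data.class);
            BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(data, null);
            return reader.read(null, decoder);
        } catch (Exception e) {
            // Keep the original exception as the cause so the real failure is visible
            throw new SerializationException("Error when deserializing byte[] to Data", e);
        }
    }

    @Override
    public void close() {
        // nothing to do
    }
}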

The problem is that this approach requires SpecificDatumReader<Data>, i.e. the generated Data class has to be integrated into the application code. This could be problematic: if the schema changes, the Data class must be regenerated and integrated again.

Two questions:
   - Should I use GenericDatumReader in the application instead? How do I do that correctly? (I can simply store the schema in the application.)
   - Is there a simple way to keep working with SpecificDatumReader when Data changes? How could it be integrated without much trouble?
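
To make the first question concrete, here is a minimal sketch of the generic approach as I understand it. The two schemas, the field names (id, note) and the class name GenericDataExample are made up for illustration and are not my real schema. The reader is given both the schema the bytes were written with and the schema the application expects, and Avro is supposed to resolve the difference (here, a newly added field with a default value).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class GenericDataExample {
    // Hypothetical schema used by the producer (older version).
    static final String WRITER_SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Data\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"long\"}]}";

    // Hypothetical schema stored in application "A" (newer version, one extra field with a default).
    static final String READER_SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Data\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"long\"},"
      + "{\"name\":\"note\",\"type\":\"string\",\"default\":\"none\"}]}";

    // Encode a generic record with the writer schema (stands in for the producer side).
    static byte[] encode(Schema writerSchema, GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(writerSchema).write(record, encoder);
        encoder.flush(); // binaryEncoder buffers, so flush before reading the bytes
        return out.toByteArray();
    }

    // Decode without any generated class: only the two schemas are needed.
    static GenericRecord decode(Schema writerSchema, Schema readerSchema, byte[] bytes)
            throws IOException {
        GenericDatumReader<GenericRecord> reader =
            new GenericDatumReader<>(writerSchema, readerSchema);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return reader.read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        // Separate parsers, because one parser refuses to define the name "Data" twice.
        Schema writerSchema = new Schema.Parser().parse(WRITER_SCHEMA_JSON);
        Schema readerSchema = new Schema.Parser().parse(READER_SCHEMA_JSON);

        GenericRecord record = new GenericData.Record(writerSchema);
        record.put("id", 42L);

        GenericRecord decoded = decode(writerSchema, readerSchema, encode(writerSchema, record));
        // Fields are accessed by name; string values come back as Avro's Utf8 type.
        System.out.println(decoded.get("id") + " " + decoded.get("note"));
    }
}
```

Note that plain binary messages carry no schema, so in this sketch the application still has to know which schema each message was written with.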
Thanks