Posted to dev@avro.apache.org by "Allen Wang (Jira)" <ji...@apache.org> on 2020/06/10 17:30:00 UTC

[jira] [Commented] (AVRO-2799) Endless recursive calls by using ProtobufDatumReader when decoding proto map

    [ https://issues.apache.org/jira/browse/AVRO-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17132533#comment-17132533 ] 

Allen Wang commented on AVRO-2799:
----------------------------------

hello [~rskraba]:

Verified against the 1.11.0 snapshot. The endless loop is gone. However, it still throws an exception:
{code:java}
Exception in thread "main" java.lang.ClassCastException: class org.apache.avro.generic.GenericData$Record cannot be cast to class com.google.protobuf.MessageOrBuilder (org.apache.avro.generic.GenericData$Record and com.google.protobuf.MessageOrBuilder are in unnamed module of loader 'app')
    at org.apache.avro.protobuf.ProtobufData.getRecordState(ProtobufData.java:120)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237)
    at org.apache.avro.protobuf.ProtobufDatumReader.readRecord(ProtobufDatumReader.java:62)
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
    at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:298)
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:183)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
    at org.apache.avro.protobuf.ProtobufDatumReader.readRecord(ProtobufDatumReader.java:62)
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
    at com.example.EventTest.main(EventTest.java:53)
{code}
I can confirm that the exception is related to the data in the protobuf map: once I commented out the line below, which adds an entry to the map, the exception went away.
{code:java}
.putPayload("key", Any.newBuilder().setValue(ByteString.copyFromUtf8("xyz")).build())
{code}
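In case it helps narrow this down, the check I have in mind is sketched below (the class name is mine, and it assumes the generated outer class is Example): decoding the same bytes with a plain GenericDatumReader against the schema that ProtobufData derives bypasses ProtobufData.newRecord()/getRecordState() entirely, which should show whether the encoded bytes or only the protobuf-specific record construction is the problem.
{code:java}
// A sketch, not a verified workaround: read the bytes produced by the test program
// back as generic records instead of protobuf builders.
import java.io.ByteArrayInputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.protobuf.ProtobufData;

import com.example.Example.RevenueEvent; // generated class; outer class name "Example" assumed

public class GenericDecodeCheck {
  public static void decode(byte[] bytes) throws Exception {
    Schema schema = ProtobufData.get().getSchema(RevenueEvent.class);
    GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
    Decoder decoder = DecoderFactory.get().binaryDecoder(new ByteArrayInputStream(bytes), null);
    GenericRecord record = reader.read(null, decoder);
    System.out.println(record.get("payload")); // the proto map field that triggers the cast error
  }
}
{code}
If that generic read succeeds on the same payload, the remaining problem would seem to be limited to how ProtobufData rebuilds the map entries holding google.protobuf.Any values.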
 

When will this issue be addressed? It is a blocker for us in adopting Avro as the encoding format for data transport.

Also, I would like to know whether decoding from Avro back to Protobuf is generally supported, or whether it works only on a case-by-case basis depending on the protobuf definition.

Thanks!

> Endless recursive calls by using ProtobufDatumReader when decoding proto map
> ----------------------------------------------------------------------------
>
>                 Key: AVRO-2799
>                 URL: https://issues.apache.org/jira/browse/AVRO-2799
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.2
>            Reporter: Allen Wang
>            Priority: Major
>
> I have a protobuf message defined as follows:
>  
> {code:java}
> syntax = "proto3";
> package ExampleProtobuf;
> option java_package = "com.example";
> import "google/protobuf/any.proto";
> import "google/protobuf/timestamp.proto";
> import "google/protobuf/wrappers.proto";
> enum EventType {
>   TYPE_UNKNOWN = 0;
>   TYPE_ORDER_RECEIVED = 1;
>   TYPE_PAYMENT_AUTHORIZED = 2;
>   TYPE_PAYMENT_CAPTURED = 3;
>   TYPE_ORDER_SUBMITTED = 4;
>   TYPE_ORDER_DELIVERY_CREATED = 5;
>   TYPE_MERCHANT_CONFIRM_ORDER = 6;
> }
> message RevenueEvent {
>   google.protobuf.StringValue id = 1;
>   EventType type = 2;
>   google.protobuf.Timestamp change_time = 3;
>   google.protobuf.StringValue country = 4;
>   map <string, google.protobuf.Any> payload = 5;
>   google.protobuf.StringValue source = 6;
>   bool is_test = 7;
>   google.protobuf.StringValue version = 8;
>   google.protobuf.StringValue checksum = 9;
> }
> {code}
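> 
> For reference, the Avro schema that ProtobufData derives from this message can be printed as sketched below (assuming the generated class is com.example.Example.RevenueEvent):
> {code:java}
> // Print the Avro schema derived from the generated protobuf class, including
> // how the map<string, google.protobuf.Any> payload field is represented.
> System.out.println(ProtobufData.get().getSchema(RevenueEvent.class).toString(true));
> {code}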
>  
> I wrote a program to encode such a Protobuf message with the Avro schema and convert it back to a Protobuf message:
>  
> {code:java}
> package com.example;
> 
> import java.io.ByteArrayInputStream;
> import java.io.ByteArrayOutputStream;
> import java.io.InputStream;
> 
> import org.apache.avro.generic.GenericDatumWriter;
> import org.apache.avro.io.DatumReader;
> import org.apache.avro.io.Decoder;
> import org.apache.avro.io.DecoderFactory;
> import org.apache.avro.io.Encoder;
> import org.apache.avro.io.EncoderFactory;
> import org.apache.avro.protobuf.ProtobufDatumReader;
> import org.apache.avro.protobuf.ProtobufDatumWriter;
> 
> import com.google.protobuf.Any;
> import com.google.protobuf.ByteString;
> import com.google.protobuf.StringValue;
> import com.google.protobuf.Timestamp;
> 
> // Generated protobuf classes (the outer class name "Example" is assumed from the usage below)
> import com.example.Example.EventType;
> import com.example.Example.RevenueEvent;
> 
> public class EventTest {
>   public static void main(String[] args) throws  Exception {
>     Example.RevenueEvent event = RevenueEvent.newBuilder()
>             .setId(StringValue.newBuilder()
>                     .setValue("12345")
>                     .build())
>             .setChangeTime(Timestamp.newBuilder()
>                     .setSeconds(System.currentTimeMillis())
>                     .build())
>             .setChecksum(StringValue.newBuilder()
>                     .setValue("xyz")
>                     .build())
>             .setCountry(StringValue.newBuilder()
>                     .setValue("US")
>                     .build())
>             .setSource(StringValue.newBuilder()
>                     .setValue("unknown")
>                     .build())
>             .setType(EventType.TYPE_MERCHANT_CONFIRM_ORDER)
>             .putPayload("key", Any.newBuilder().setValue(ByteString.copyFromUtf8("xyz")).build())
>             .build();
>     ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
>     Encoder encoder = EncoderFactory.get().directBinaryEncoder(outputStream, null);
>     GenericDatumWriter<Example.RevenueEvent> datumWriter = new ProtobufDatumWriter<Example.RevenueEvent>(RevenueEvent.class);
>     datumWriter.write(event, encoder);
>     encoder.flush();
>     byte[] bytes = outputStream.toByteArray();
>     DatumReader<RevenueEvent> datumReader = new ProtobufDatumReader<>(RevenueEvent.class);
>     InputStream inputStream = new ByteArrayInputStream(bytes);
>     Decoder decoder = DecoderFactory.get().binaryDecoder(inputStream, null);
>     datumReader.read(null, decoder);
>   }
> }
> {code}
>  
> The program crashed with a java.lang.StackOverflowError:
>  
> Exception in thread "main" java.lang.StackOverflowError
>     at java.base/java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>     at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:250)
>     at org.apache.avro.protobuf.ProtobufData.newRecord(ProtobufData.java:141)
>     at org.apache.avro.protobuf.ProtobufData.newRecord(ProtobufData.java:143)
>     at org.apache.avro.protobuf.ProtobufData.newRecord(ProtobufData.java:143)
>     at org.apache.avro.protobuf.ProtobufData.newRecord(ProtobufData.java:143)
> ...
>  
> Looking into ProtobufData.java, there seems to be a case where the recursion never ends:
>  
> {code:java}
> @Override
> public Object newRecord(Object old, Schema schema) {
>   try {
>     Class c = SpecificData.get().getClass(schema);
>     if (c == null)
>       return newRecord(old, schema); // punt to generic <-- endless recursion
>     if (c.isInstance(old))
>       return old; // reuse instance
>     return c.getMethod("newBuilder").invoke(null);
>   } catch (Exception e) {
>     throw new RuntimeException(e);
>   }
> }
> {code}
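> 
> One way to break the recursion (a sketch from my side, not necessarily the actual fix) would be to really punt to the generic implementation instead of calling the same method again:
> {code:java}
> @Override
> public Object newRecord(Object old, Schema schema) {
>   try {
>     Class c = SpecificData.get().getClass(schema);
>     if (c == null)
>       return GenericData.get().newRecord(old, schema); // fall back to GenericData instead of recursing
>     if (c.isInstance(old))
>       return old; // reuse instance
>     return c.getMethod("newBuilder").invoke(null);
>   } catch (Exception e) {
>     throw new RuntimeException(e);
>   }
> }
> {code}
> Handing back a generic record still has to be accepted by the rest of the protobuf-specific read path (for example ProtobufData.getRecordState()), so breaking the recursion alone may not be enough to make the round trip work.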
> Further experimentation shows that if the map payload is not set, decoding succeeds.
>  
> protobuf-java:3.8.0 is used.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)