Posted to user@avro.apache.org by Javier Holguera <ja...@gmail.com> on 2019/12/02 14:57:51 UTC

"class org.apache.avro.util.Utf8 cannot be cast to class java.lang.String" error

Hi,

I'm getting this error message when trying to deserialize an Avro-encoded
Kafka record.

The producer is a Kafka Connect connector called Debezium. Looking at
Schema Registry, the writer schema looks like this:

{
   "type":"record",
   "name":"Envelope",

 "namespace":"com.nutmeg.shared.cdc_admin_actions.nutmeglive.T_ADM_DELAYEDACTION",
   "fields":[
      {
         "name":"before",
         "type":[
            "null",
            {
               "type":"record",
               "name":"Value",
               "fields":[
                  {
                     "name":"UUID",
                     "type":"string"
                  },
                  {
                     "name":"ACTION",
                     "type":[
                        "null",
                        "string"
                     ],
                     "default":null
                  },
                  {
                     "name":"CREATEDAT",
                     "type":[
                        "null",
                        "long"
                     ],
                     "default":null
                  },
[ REDACTED FOR CLARITY ]

   ],
   "connect.name
":"com.nutmeg.shared.cdc_admin_actions.nutmeglive.T_ADM_DELAYEDACTION.Envelope"
}


On the consumer side, I'm using the Avro Maven plugin to generate a bunch of
Java POJOs (the plugin configuration is sketched right after the schema).
Looking at the POJO for Value, which is the class that blows up, I can see
this reader schema:

{
   "type":"record",
   "name":"Envelope",

 "namespace":"com.nutmeg.shared.cdc_admin_actions.nutmeglive.T_ADM_DELAYEDACTION",
   "fields":[
      {
         "name":"before",
         "type":[
            "null",
            {
               "type":"record",
               "name":"Value",
               "fields":[
                  {
                     "name":"UUID",
                     "type":{
                        "type":"string",
                        "avro.java.string":"String"
                     }
                  },
                  {
                     "name":"ACTION",
                     "type":[
                        "null",
                        {
                           "type":"string",
                           "avro.java.string":"String"
                        }
                     ],
                     "default":null
                  },
                  {
                     "name":"CREATEDAT",
                     "type":[
                        "null",
                        "long"
                     ],
                     "default":null
                  },

[ REDACTED FOR CLARITY ]

   "connect.name
":"com.nutmeg.shared.cdc_admin_actions.nutmeglive.T_ADM_DELAYEDACTION.Envelope"
}
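
These "avro.java.string" hints come from the Avro Maven plugin's stringType
setting on the consumer side. My configuration looks roughly like this (a
minimal sketch; the plugin version shown is an assumption):

<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.9.1</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <!-- Emits "avro.java.string":"String" on every string in the
             generated schema, so generated fields are java.lang.String. -->
        <stringType>String</stringType>
      </configuration>
    </execution>
  </executions>
</plugin>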

With those hints in place, the generated code should deserialize every
string field straight into java.lang.String. The problem is that I get this
runtime exception instead:

Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 15
Caused by: java.lang.ClassCastException: class org.apache.avro.util.Utf8 cannot be cast to class java.lang.String (org.apache.avro.util.Utf8 is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')
at com.nutmeg.portfolio.cdc_fund_details.nutmeglive.T_WEB_FUND.Value.put(Value.java:393) ~[classes/:na]
at org.apache.avro.generic.GenericData.setField(GenericData.java:795) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:139) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:170) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:178) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:170) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.9.1.jar:1.9.1]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144) ~[avro-1.9.1.jar:1.9.1]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:125) ~[kafka-avro-serializer-5.3.0.jar:na]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:79) ~[kafka-avro-serializer-5.3.0.jar:na]
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:55) ~[kafka-avro-serializer-5.3.0.jar:na]
at io.confluent.kafka.streams.serdes.avro.SpecificAvroDeserializer.deserialize(SpecificAvroDeserializer.java:66) ~[kafka-streams-avro-serde-5.3.0.jar:na]
at io.confluent.kafka.streams.serdes.avro.SpecificAvroDeserializer.deserialize(SpecificAvroDeserializer.java:38) ~[kafka-streams-avro-serde-5.3.0.jar:na]
at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58) ~[kafka-clients-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordQueue.maybeUpdateTimestamp(RecordQueue.java:160) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:136) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:744) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.addRecordsToTasks(StreamThread.java:1022) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:860) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:804) ~[kafka-streams-2.2.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:773) ~[kafka-streams-2.2.1.jar:na]
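
For completeness, the consumer is a Kafka Streams app wired up roughly like
this (a sketch; the topic name, registry URL, and the string key serde are
placeholders, and Envelope stands for the POJO generated by the plugin):

import java.util.Collections;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;

public class ConsumerWiring {
  public static void main(String[] args) {
    // Serde that delegates to KafkaAvroDeserializer, as in the stack trace
    final SpecificAvroSerde<Envelope> valueSerde = new SpecificAvroSerde<>();
    // false = configure as a value serde; the URL is a placeholder
    valueSerde.configure(
        Collections.singletonMap("schema.registry.url", "http://schema-registry:8081"),
        false);

    final StreamsBuilder builder = new StreamsBuilder();
    // The exception fires as soon as a record from this topic is deserialized
    builder.stream("cdc-admin-actions", Consumed.with(Serdes.String(), valueSerde));
  }
}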

My understanding of the problem is:

1. The producer (which is using Avro 1.8.1) serializes strings with no
"avro.java.string" hint in the writer schema, so by default they come back
as org.apache.avro.util.Utf8.
2. The consumer (which is using Avro 1.9.1) expects strings as
java.lang.String, as instructed by the reader schema (via the Avro Maven
plugin config), but the Utf8 value reaches the generated setter unconverted
and fails on the cast sketched below.
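
For illustration, this is roughly what the failing generated setter looks
like (my reconstruction; the exact fields and their order are assumptions
based on the schema above):

// Avro-generated SpecificRecord classes implement put() as a switch with
// one unchecked cast per field; receiving a Utf8 where the generated field
// type is java.lang.String produces exactly the ClassCastException above.
@Override
public void put(int field$, Object value$) {
  switch (field$) {
  case 0: UUID = (java.lang.String) value$; break;   // <- fails on Utf8
  case 1: ACTION = (java.lang.String) value$; break;
  case 2: CREATEDAT = (java.lang.Long) value$; break;
  default: throw new org.apache.avro.AvroRuntimeException("Bad index");
  }
}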

I am aware that I could fix this by changing how the consumer's Avro POJOs
are generated. However, that would force me to handle all the corresponding
POJO properties as CharSequence/org.apache.avro.util.Utf8 values, which is a
hassle considering there is already a perfectly good String class in Java...
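
Concretely, if I regenerated the POJOs with stringType set to CharSequence,
every place that reads these records would need conversions along these
lines (the getter name is illustrative):

// With stringType=CharSequence the generated getter returns a CharSequence
// that is backed by org.apache.avro.util.Utf8 at runtime
CharSequence rawAction = value.getACTION();
String action = (rawAction == null) ? null : rawAction.toString();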

Also, it seems that downgrading the Avro deps (serializer and Maven plugin)
to 1.8.2 on the consumer side fixes the problem, which suggests some sort of
regression or backward-incompatible change between 1.8.2 and 1.9.1.
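
For reference, the downgrade amounts to pinning both the runtime library and
the code generator in the consumer's pom.xml (a sketch):

<!-- in <dependencies>: pin the Avro runtime back to 1.8.2 -->
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
  <version>1.8.2</version>
</dependency>

<!-- in <build><plugins>: pin the code generator to match -->
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.8.2</version>
</plugin>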

Is this a known issue? Is there a way around it that doesn't involve changing
the generated POJO classes?

Thanks.

Regards,
Javier.