Posted to user@avro.apache.org by Priit Danelson <pr...@bwt.ee> on 2019/10/15 11:58:55 UTC

avro.java.string property being ignored during deserialization when nested inside unions

The error can be reproduced with the following simplified schemas (with the only difference being the presence of the “avro.java.string” property in the reader schema and its absence in the writer schema):

Reader:
{
  "type": "record",
  "name": "Event",
  "namespace": "com.example.event.model",
  "fields": [
    {
      "name": "body",
      "type": {
        "type": "record",
        "name": "EventBody",
        "namespace": "com.example.event.model",
        "fields": [
          {
            "name": "optionalNestedObject",
            "type": [
              "null",
              {
                "type": "record",
                "name": "NestedObject",
                "fields": [
                  {
                    "name": "mandatoryString",
                    "type": {
                      "type": "string",
                      "avro.java.string": "String"
                    }
                  }
                ]
              }
            ],
            "default": null
          }
        ]
      }
    }
  ]
}

Writer:
{
  "type": "record",
  "name": "Event",
  "namespace": "com.example.event.model",
  "fields": [
    {
      "name": "body",
      "type": {
        "type": "record",
        "name": "EventBody",
        "namespace": "com.example.event.model",
        "fields": [
          {
            "name": "optionalNestedObject",
            "type": [
              "null",
              {
                "type": "record",
                "name": "NestedObject",
                "fields": [
                  {
                    "name": "mandatoryString",
                    "type": "string"
                  }
                ]
              }
            ],
            "default": null
          }
        ]
      }
    }
  ]
}

The issue seems to be caused by org.apache.avro.Resolver.unionEquiv() not considering properties when determining whether the “optionalNestedObject” union is equivalent in both schemas. Because the union is falsely considered equivalent, org.apache.avro.io.ResolvingGrammarGenerator.generate() then uses the writer schema to determine the String class for the string field within that union. This ultimately results in a ClassCastException: the string is deserialized as org.apache.avro.util.Utf8, but the POJO generated from the reader schema has a field of the correct java.lang.String type.
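
To make the suspected mechanism concrete, here is a toy model of it in plain Java. It does not use or reproduce Avro's code: the Node class, optionalNestedObject(), equivalent() and resolvedStringClass() are hypothetical names invented for this sketch, and defaulting to "Utf8" when the property is absent is an assumption based on the behaviour described above. It only shows how an equivalence check that skips properties would pick the writer's string representation for the field inside the union.

```java
import java.util.List;
import java.util.Map;

public class UnionEquivSketch {

    /** Toy stand-in for a schema node; this is NOT Avro's Schema class. */
    static final class Node {
        final String type;
        final Map<String, String> props;
        final List<Node> children;

        Node(String type, Map<String, String> props, List<Node> children) {
            this.type = type;
            this.props = props;
            this.children = children;
        }
    }

    /** Builds ["null", record { mandatoryString: string }] with the given string properties. */
    static Node optionalNestedObject(Map<String, String> stringProps) {
        Node str = new Node("string", stringProps, List.of());
        Node record = new Node("record", Map.of(), List.of(str));
        Node nul = new Node("null", Map.of(), List.of());
        return new Node("union", Map.of(), List.of(nul, record));
    }

    /** Structural comparison; properties are compared only when checkProps is true. */
    static boolean equivalent(Node w, Node r, boolean checkProps) {
        if (!w.type.equals(r.type)) return false;
        if (checkProps && !w.props.equals(r.props)) return false;
        if (w.children.size() != r.children.size()) return false;
        for (int i = 0; i < w.children.size(); i++) {
            if (!equivalent(w.children.get(i), r.children.get(i), checkProps)) return false;
        }
        return true;
    }

    /** Returns the string representation chosen for the nested string field. */
    static String resolvedStringClass(Node writer, Node reader, boolean checkProps) {
        // If the unions are judged equivalent, the writer's union is reused as-is.
        Node chosen = equivalent(writer, reader, checkProps) ? writer : reader;
        Node str = chosen.children.get(1).children.get(0);
        // Assumption: with no avro.java.string property, the value is read as Utf8.
        return str.props.getOrDefault("avro.java.string", "Utf8");
    }

    public static void main(String[] args) {
        Node writer = optionalNestedObject(Map.of());
        Node reader = optionalNestedObject(Map.of("avro.java.string", "String"));
        System.out.println(resolvedStringClass(writer, reader, false)); // Utf8
        System.out.println(resolvedStringClass(writer, reader, true));  // String
    }
}
```

With checkProps set to false the two unions compare as equivalent and the writer's property-less string wins, which corresponds to the Utf8 value that later fails the cast. With checkProps set to true the difference in the nested string is detected and the reader's "String" setting is kept.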