Posted to dev@avro.apache.org by "Roman Mitasov (Jira)" <ji...@apache.org> on 2024/03/26 12:37:00 UTC

[jira] [Created] (AVRO-3966) toString generates poorly formatted defaults for bytes

Roman Mitasov created AVRO-3966:
-----------------------------------

             Summary: toString generates poorly formatted defaults for bytes
                 Key: AVRO-3966
                 URL: https://issues.apache.org/jira/browse/AVRO-3966
             Project: Apache Avro
          Issue Type: Bug
            Reporter: Roman Mitasov


Schema#toString and Protocol#toString both write out the default values of "bytes" and "fixed" fields.

According to the docs:
{quote}Default values for bytes and fixed fields are JSON strings, where Unicode code points 0-255 are mapped to unsigned 8-bit byte values 0-255. Avro encodes a field even if its value is equal to its default.
{quote}
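To make the quoted mapping concrete, here is a minimal sketch in plain Java. It is not Avro's own code: the class {{FixedDefaultBytes}} and its {{toBytes}} method are hypothetical names used only for illustration. It relies on the fact that ISO-8859-1 maps code points 0-255 to the identical byte values.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Avro.
final class FixedDefaultBytes {

    // Interprets a bytes/fixed default string as raw bytes:
    // each code point 0-255 becomes the byte with the same unsigned value.
    static byte[] toBytes(String jsonDefault) {
        for (int i = 0; i < jsonDefault.length(); i++) {
            if (jsonDefault.charAt(i) > 0xFF) {
                throw new IllegalArgumentException(
                    "code point above 255 at index " + i);
            }
        }
        // ISO-8859-1 encodes code points 0-255 as the same byte values.
        return jsonDefault.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Tab (9), space (32), and code point 255.
        byte[] bytes = toBytes("\t \u00FF");
        for (byte b : bytes) {
            System.out.println(b & 0xFF); // prints 9, 32, 255 on separate lines
        }
    }
}
```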
Consider the following schema:
{code:json}
{
  "type" : "record",
  "name" : "TestRecord",
  "fields" : [ {
    "name" : "testFixed",
    "type" : {
      "type" : "fixed",
      "name" : "Code",
      "size" : 3
    },
    "default" : "\u0009\u0020\u00FF"
  } ]
}{code}

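If this schema is parsed and then serialized back to JSON, the "default" value comes out as {{"\t ÿ"}}: the tab is written with the short escape, while the space and the code point 255 are written as literal characters.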

This happens because the {{toString}} implementations use Jackson's {{JsonGenerator}} with its default escaping configuration, which does not escape non-ASCII characters.
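One possible shape for a fix, sketched in plain Java without Jackson: write every character of a bytes/fixed default as a four-digit Unicode escape, so the output matches the form used in the schema above. The class {{BytesDefaultEscaper}} and its method {{toJsonString}} are hypothetical names, and this is only an assumption about what "well formatted" should mean here, not the change adopted by the project.

```java
// Hypothetical helper for illustration only; not part of Avro or Jackson.
final class BytesDefaultEscaper {

    // Renders a bytes/fixed default as a JSON string literal in which every
    // character is written as a backslash-u escape with four hex digits.
    static String toJsonString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException(
                    "code point above 255 at index " + i);
            }
            // Cast to int: the %X conversion does not accept a char argument.
            sb.append(String.format("\\u%04X", (int) c));
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // Tab, space, and code point 255, as in the parsed default above.
        System.out.println(toJsonString("\t \u00FF"));
    }
}
```

Jackson also provides {{JsonGenerator.Feature.ESCAPE_NON_ASCII}}, which would escape the code point 255 while still writing the tab as {{\t}}; whether that is sufficient for this issue is left open here.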



--
This message was sent by Atlassian Jira
(v8.20.10#820010)