Posted to dev@avro.apache.org by "Steve Stagg (Jira)" <ji...@apache.org> on 2023/08/24 10:46:00 UTC

[jira] [Created] (AVRO-3843) [Python] bytes field default values are incorrectly encoded

Steve Stagg created AVRO-3843:
---------------------------------

             Summary: [Python] bytes field default values are incorrectly encoded
                 Key: AVRO-3843
                 URL: https://issues.apache.org/jira/browse/AVRO-3843
             Project: Apache Avro
          Issue Type: Bug
    Affects Versions: 1.11.2, 1.12.0
            Reporter: Steve Stagg


Default values for record fields of type 'bytes' are currently UTF-8 encoded (the call is '<string>'.encode(), which defaults to UTF-8), which, as far as I can tell, is incorrect.
This means that if a bytes field has a default value of "\u00ff\u00ff" and the default is used during decoding, the value b'\xc3\xbf\xc3\xbf' is returned rather than the expected b'\xff\xff'.
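
The difference can be reproduced in plain Python without the avro library. The sketch below is illustrative only: the variable names are hypothetical, and it assumes the rule from the Avro specification that a bytes default is a JSON string whose code points 0-255 map directly to byte values 0-255, which matches ISO-8859-1 encoding. It is not necessarily how the library should implement the fix.

```python
# Default value as it would appear after parsing the schema JSON.
default = "\u00ff\u00ff"

# str.encode() with no argument uses UTF-8, so each U+00FF becomes two bytes.
utf8_bytes = default.encode()

# ISO-8859-1 maps code points 0-255 one-to-one onto byte values 0-255.
latin1_bytes = default.encode("iso-8859-1")

print(utf8_bytes)    # b'\xc3\xbf\xc3\xbf'
print(latin1_bytes)  # b'\xff\xff'
```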

Avro < 1.11 appears to do the correct thing here.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)