Posted to issues@avro.apache.org by GitBox <gi...@apache.org> on 2021/05/02 14:57:50 UTC

[GitHub] [avro] tjwp edited a comment on pull request #1082: AVRO-3036: add schema matching for bytes decimal logical type in Ruby

tjwp edited a comment on pull request #1082:
URL: https://github.com/apache/avro/pull/1082#issuecomment-817199150

@jturkel @ziggythehamster @andrewthauer I've gone around and around a couple of times on how I think this matching should work.

I'd appreciate it if you could take another look at this code, but I'll outline what I settled on and why:

1. Decimal logical types (whether bytes or fixed) now match if they have the same precision and scale. This was the issue caught in the previous review (thank you!).
2. Bytes schemas will always match, and the decimal logical type is ignored for matching. I made this decision to ensure that encoded data can always be read. I believe that this is consistent with the specification that "If a logical type is invalid ..., then implementations should ignore the logical type and use the underlying Avro type". For example, the Ruby implementation included support for parsing decimal logical types before it supported encoding/decoding them. Making this choice allows encoded data written when the decimal logical type was ignored to still be read by a future release.
3. Similarly, fixed schemas will always match based on the usual (name & size) constraints for fixed schemas, and the decimal logical type is ignored for matching. Again, I think this is consistent with allowing data to be decoded even when the decimal logical type was not implemented previously. This may be the situation again for the Ruby implementation, where fixed decimal logical type encoding/decoding is not yet supported.
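
To make the three rules concrete, here is a minimal sketch of the matching logic. It is not the code in this PR: `SketchSchema` and every method name below are hypothetical stand-ins rather than the avro gem's classes, and it assumes that rule 1 applies regardless of whether each side is bytes or fixed.

```ruby
# Hypothetical stand-in for a parsed schema; NOT the avro gem's schema classes.
SketchSchema = Struct.new(:type, :name, :size, :logical_type, :precision, :scale,
                          keyword_init: true)

def decimal?(schema)
  schema.logical_type == "decimal"
end

# Rule 1: two decimal logical types match when precision and scale agree.
# A missing scale is treated as 0 (an assumption of this sketch).
def decimal_match?(writer, reader)
  decimal?(writer) && decimal?(reader) &&
    writer.precision == reader.precision &&
    (writer.scale || 0) == (reader.scale || 0)
end

# Rules 2 and 3: the underlying types match on their own terms and any
# decimal logical type is ignored.
def underlying_match?(writer, reader)
  case [writer.type, reader.type]
  when [:bytes, :bytes] then true
  when [:fixed, :fixed] then writer.name == reader.name && writer.size == reader.size
  else false
  end
end

# Data is readable if either rule applies; it is only read as a decimal
# when decimal_match? is true.
def schemas_match?(writer, reader)
  decimal_match?(writer, reader) || underlying_match?(writer, reader)
end
```

With this shape, a bytes writer with decimal(9, 2) and a bytes reader with decimal(5, 2) still match via rule 2, but `decimal_match?` is false, so the reader would see the underlying bytes rather than a Decimal.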

The main downside that I see to the choices above is that decoding could return either a Decimal or the underlying Avro type value if decimal precision and scale do not match or the "decimal" value is invalid. This places more burden on applications if they are expecting a Decimal, but the upside is that data can be read without raising an error.

The other implication of these choices is that the (bytes) decimal logical type implementation may need additional error handling for cases where the "decimal" is invalid and the underlying Avro type value should be returned.
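
As an illustration of the fallback described above, here is a rough sketch of decoding a bytes decimal and returning the underlying value when the "decimal" is invalid. The method names and the `decimal_matched:` flag are hypothetical, not the gem's API. It assumes the encoding from the specification (big-endian two's-complement unscaled integer) and treats an empty byte string or a value with more digits than the precision as invalid, which is my assumption about what "invalid" would cover.

```ruby
require "bigdecimal"

# Interpret a binary string as a big-endian two's-complement integer.
def unscaled_from_bytes(bytes)
  value = bytes.bytes.inject(0) { |acc, byte| (acc << 8) | byte }
  value -= 1 << (8 * bytes.bytesize) if bytes.getbyte(0) >= 0x80
  value
end

# Returns a BigDecimal when the value is a valid decimal for the given
# precision and scale; otherwise returns the raw bytes unchanged.
# decimal_matched: is a hypothetical flag for "writer and reader decimals matched".
def decode_decimal_or_raw(bytes, precision:, scale: 0, decimal_matched: true)
  return bytes unless decimal_matched
  return bytes if bytes.empty?

  unscaled = unscaled_from_bytes(bytes)
  return bytes if unscaled.abs.to_s.length > precision

  # Multiplying by a power of ten is exact for BigDecimal.
  BigDecimal(unscaled) * BigDecimal("1e#{-scale}")
end
```

An application that expects a Decimal would then need to check the class of the returned value, which is the extra burden mentioned above.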

I'm open to other opinions, so please challenge this if you think the approach above is incorrect. If there are still questions I can take this to the mailing list to see if other maintainers have opinions on how this is supposed to work.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org