Posted to issues@avro.apache.org by "Rik Heijdens (Jira)" <ji...@apache.org> on 2022/09/29 13:02:00 UTC
[jira] [Comment Edited] (AVRO-3631) Fix serialization of structs containing Fixed fields
[ https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611026#comment-17611026 ]
Rik Heijdens edited comment on AVRO-3631 at 9/29/22 1:01 PM:
-------------------------------------------------------------
Okay, so I'm starting to understand the issue a bit more, and I added a few more test-cases to the branch that I [linked earlier|https://github.com/apache/avro/compare/master...privacy-com:avro:avro-3631/fix-fixed-serialization?expand=1].
Contrary to what I initially thought, the compatibility problems with `Value::Fixed` do not appear to be isolated to serialization. They also affect deserialization of a `Value::Record` wrapping a `Value::Fixed` into a Rust struct. I added a test-case in [12ef14b|https://github.com/apache/avro/commit/12ef14b6a5cc102bcc0317251cd37471148d4926] which illustrates this.
I did note, however, that this is consistent with the serialization implementation: a Rust `[u8; 6]` is serialized into a `Value::Array<Value::Int>`, as illustrated by [a31fcfc|https://github.com/apache/avro/commit/a31fcfc96493e3180490dec5622cab485bf7cd79].
However, I am unsure as to how we should move forward with this: at serialization time the Schema information is not available to the Serializer and thus it wouldn't know if we were expecting to serialize to `Value::Array<Value::Int>` or `Value::Fixed`.
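To make the mismatch concrete, here is a minimal sketch. It assumes a simplified stand-in `Value` enum that only loosely mirrors the variants discussed here (it is not the crate's actual type), and the two helper functions are hypothetical names used purely for illustration. Without schema information, a serializer only sees a sequence of `u8` elements and has no reason to pick `Fixed` over an array of ints.

```rust
// Simplified stand-in for the variants discussed in this thread.
// This is an assumption for illustration, not `apache_avro::types::Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Array(Vec<Value>),
    Fixed(usize, Vec<u8>),
}

/// Hypothetical helper: what a schema-less serializer produces when it is
/// handed the bytes one element at a time (each `u8` becomes an `Int`).
fn serialize_without_schema(bytes: &[u8]) -> Value {
    Value::Array(bytes.iter().map(|b| Value::Int(i32::from(*b))).collect())
}

/// Hypothetical helper: the value a `fixed` schema of matching size
/// would accept for the same bytes.
fn expected_for_fixed_schema(bytes: &[u8]) -> Value {
    Value::Fixed(bytes.len(), bytes.to_vec())
}
```

Calling both helpers on the same `[u8; 6]` yields two different `Value`s, which is the shape of the validation failure described in the issue.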
I'll ponder on this for a bit, but would appreciate suggestions if you have any on how we can move forward with this [~mgrigorov]
> Fix serialization of structs containing Fixed fields
> ----------------------------------------------------
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
> Issue Type: Bug
> Components: rust
> Reporter: Rik Heijdens
> Priority: Major
>
> Consider the following minimal Avro Schema:
> {noformat}
> {
>   "type": "record",
>   "name": "TestStructFixedField",
>   "fields": [
>     {
>       "name": "field",
>       "type": {
>         "name": "field",
>         "type": "fixed",
>         "size": 6
>       }
>     }
>   ]
> }
> {noformat}
> In Rust, I might represent this schema with the following struct:
> {noformat}
> use serde::{Deserialize, Serialize};
>
> #[derive(Debug, Serialize, Deserialize)]
> struct TestStructFixedField {
>     field: [u8; 6],
> }
> {noformat}
> I would then expect to be able to use `apache_avro::to_avro_datum()` to convert an instance of `TestStructFixedField` into a `Vec<u8>` using an instance of `Schema` initialized from the schema listed above.
> However, this fails because the `Value` produced by `apache_avro::to_value()` represents `field` as a `Value::Array<Value::Int>` rather than a `Value::Fixed<6, Vec<u8>>`, which does not pass schema validation.
> I believe that there are two options to fix this:
> 1. Allow `Value::Array<Vec<Value::Int>>` to pass validation if the array has the expected length and none of its elements are out of range for `u8`. If we go down this route, the implementation of `to_avro_datum()` will have to take care of converting each `Value::Int` to a `u8` when encoding to bytes.
> 2. Update `apache_avro::to_value()` such that fixed-length arrays are converted into `Value::Fixed<N, Vec<u8>>` rather than `Value::Array`.
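As a rough sketch of what option 1 could look like, the check below accepts an array of ints for a `fixed` of a given size only when the length matches and every element fits in a `u8`. It uses a simplified stand-in `Value` enum (an assumption for illustration, not the crate's actual type), and `coerce_to_fixed` is a hypothetical helper name, not an existing `apache_avro` function.

```rust
use std::convert::TryFrom;

// Simplified stand-in for the variants discussed in this issue.
// This is an assumption for illustration, not `apache_avro::types::Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Array(Vec<Value>),
    Fixed(usize, Vec<u8>),
}

/// Hypothetical helper: tries to reinterpret `value` as a `fixed` of `size` bytes.
/// Returns `None` if the length differs, an element is not an `Int`,
/// or an `Int` is outside the `u8` range.
fn coerce_to_fixed(value: &Value, size: usize) -> Option<Value> {
    match value {
        // Already a fixed of the right size: accept as-is.
        Value::Fixed(n, bytes) if *n == size && bytes.len() == size => Some(value.clone()),
        // An array of the right length: convert element by element.
        Value::Array(items) if items.len() == size => {
            let bytes: Option<Vec<u8>> = items
                .iter()
                .map(|item| match item {
                    Value::Int(i) => u8::try_from(*i).ok(),
                    _ => None,
                })
                .collect();
            bytes.map(|b| Value::Fixed(size, b))
        }
        _ => None,
    }
}
```

The same range check would be needed wherever the bytes are finally written, which is the extra work in `to_avro_datum()` that option 1 mentions. Option 2 avoids it by producing the `Fixed` value up front.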