Posted to dev@avro.apache.org by "Kalle Niemitalo (Jira)" <ji...@apache.org> on 2022/06/21 08:16:00 UTC

[jira] [Commented] (AVRO-3012) Unknown logical types are not ignored during deserialization but lead to an exception.

    [ https://issues.apache.org/jira/browse/AVRO-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556717#comment-17556717 ] 

Kalle Niemitalo commented on AVRO-3012:
---------------------------------------

If a schema with an unrecognized logical type were parsed as a LogicalSchema rather than as the base schema, would it be possible to give it a LogicalType whose ConvertToBaseValue and ConvertToLogicalValue methods just pass the data through without conversions? I suppose LogicalType.GetCSharpType would then have to return null or throw an exception, and its only caller CodeGen.getType would have to special-case this somehow. That doesn't really seem better than special-casing LogicalSchema.LogicalType == null.
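To make the pass-through idea concrete, here is a minimal sketch. It is written in Python rather than C#, and none of the names are the actual Avro API: PassThroughLogicalType, convert_to_base_value, convert_to_logical_value and get_native_type are hypothetical stand-ins for the C# members mentioned above.

```python
class PassThroughLogicalType:
    """Hypothetical stand-in for a LogicalType that performs no conversion."""

    def __init__(self, name):
        # Name of the unrecognized logical type, kept only so it can be re-emitted.
        self.name = name

    def convert_to_base_value(self, logical_value, schema):
        # No conversion: the "logical" value already is the base value.
        return logical_value

    def convert_to_logical_value(self, base_value, schema):
        # No conversion in the other direction either.
        return base_value

    def get_native_type(self, nullable):
        # Mirrors the GetCSharpType problem: there is no sensible answer here,
        # so the code-generation caller would have to special-case it.
        raise NotImplementedError(
            f"no native type known for logical type {self.name!r}"
        )
```

The last method shows why this does not obviously beat a null check: the special case moves from the reader into code generation instead of disappearing.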

Would it be useful for avrogen to emit the unsupported logicalType into the _SCHEMA property? Doing so would require parsing it as a LogicalSchema (or smuggling "logicalType" in Schema.Props and changing PrimitiveSchema.WriteJson). [Parsing Canonical Form|https://avro.apache.org/docs/1.11.0/spec.html#Parsing+Canonical+Form+for+Schemas] apparently does not include logicalType, but I don't know whether schema-registry software (e.g. by Confluent) cares about it.
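As a rough illustration of why logicalType drops out of Parsing Canonical Form, the sketch below applies only the [STRIP] rule from the spec, which keeps type, name, fields, symbols, items, values and size. It is written in Python for brevity, strip_attributes is a hypothetical helper rather than anything in the Avro libraries, and the other canonicalization rules (such as collapsing {"type": "string"} to "string") are deliberately left out.

```python
import json

# Attributes kept by the [STRIP] step of Parsing Canonical Form.
PCF_ATTRIBUTES = {"type", "name", "fields", "symbols", "items", "values", "size"}


def strip_attributes(schema):
    """Recursively drop attributes that Parsing Canonical Form does not keep."""
    if isinstance(schema, dict):
        return {
            key: strip_attributes(value)
            for key, value in schema.items()
            if key in PCF_ATTRIBUTES
        }
    if isinstance(schema, list):
        # Unions, field lists and symbol lists are handled element by element.
        return [strip_attributes(item) for item in schema]
    return schema  # primitive type name, symbol, or size


field_schema = {"type": "string", "logicalType": "uuid"}
print(json.dumps(strip_attributes(field_schema)))  # prints {"type": "string"}
```

So two schemas that differ only in logicalType would share a canonical form and fingerprint under this rule; whether a given registry compares canonical forms or the full schema text is the part I don't know.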

> Unknown logical types are not ignored during deserialization but lead to an exception.
> --------------------------------------------------------------------------------------
>
>                 Key: AVRO-3012
>                 URL: https://issues.apache.org/jira/browse/AVRO-3012
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: csharp
>    Affects Versions: 1.9.2, 1.10.1
>            Reporter: Lucas Heimberg
>            Priority: Major
>
> By specification (see [https://avro.apache.org/docs/current/spec.html#Logical+Types] ), unknown logical types and logical types with invalid specification should be ignored during deserialization and instead, the object of the underlying Avro type should be returned.
> The C# implementation of Avro, however, raises an exception when such a logical type is found during parsing of a schema string. Therefore, it is not possible to parse a schema with a logical type using a version of the C# implementation of Avro that does not yet support this logical type.
> For example, it is not possible to consume a binary-encoded Avro datum that makes use of the logical type UUID (as available in Avro 1.10.1) with a deserializer using C# Avro 1.9.2. (By specification, the deserializer should fall back to the underlying string representation of the UUID.)
> This severely limits the downward compatibility promised by the Avro specification.
>  
> During schema parsing, a LogicalSchema instance is created using the GetFromLogicalSchema method of the LogicalTypeFactory. This method has an optional parameter to ignore invalid logical types, which is not used by the caller.
> It is not completely obvious to me whether an unsupported logical type should still be parsed as a LogicalSchema, although without a LogicalType, or whether it should be parsed directly as the underlying Avro type (called BaseSchema in the implementation), i.e., whether the downgrade to the underlying Avro type should happen during parsing or later, during deserialization using the schema.
> For the second case, a fix to the problem would also require an update to the ReadLogical method of the GenericReader class in order to support the case that the LogicalType of the writer schema is null.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)