Posted to issues@avro.apache.org by "Christophe Le Saec (Jira)" <ji...@apache.org> on 2022/09/08 15:56:00 UTC

[jira] [Commented] (AVRO-2831) Resolver cannot find reader/writer schema difference of fields in Union

    [ https://issues.apache.org/jira/browse/AVRO-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17601895#comment-17601895 ] 

Christophe Le Saec commented on AVRO-2831:
------------------------------------------

I created this [PR|https://github.com/apache/avro/pull/1858], which just adds a unit test to show that this issue is, at least, solved in the current version (1.11.1).

> Resolver cannot find reader/writer schema difference of fields in Union
> -----------------------------------------------------------------------
>
>                 Key: AVRO-2831
>                 URL: https://issues.apache.org/jira/browse/AVRO-2831
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.0, 1.9.1, 1.9.2
>            Reporter: Jinge Dai
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the Avro Java library deserializes data and resolves fields in a Union, it uses the writer schema if the reader field type is the same as the writer's, but it ignores the properties and the logical type.
> {code:java}
> private static boolean unionEquiv(Schema write, Schema read, Map<SeenPair, Boolean> seen) {
>   final Schema.Type wt = write.getType();
>   if (wt != read.getType()) {
>     return false;
>   }
> {code}
> However, I have hit two issues:
>  # https://issues.apache.org/jira/browse/AVRO-2702 If the producer is not Java, the string fields of the inner record are deserialized to org.apache.avro.util.Utf8, which then causes this error:
> {code:java}
> Caused by: java.lang.ClassCastException: class org.apache.avro.util.Utf8 cannot be cast to class java.lang.String{code}
>  
> 2. I am using an in-house C# library on the producer side, and it builds the Date field schema as an INT type without a logicalType; this is a bug on my side. However, Avro 1.8.2 is able to handle it by using the reader schema, whereas 1.9.2 fails with this error:
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Parameters cannot be null! Parameter values:[-129, "int", null, org.apache.avro.data.TimeConversions$DateConversion@d3449b1]
> {code}
> The reason is that, although the consumer side generates the class properly and sets the field to be converted to a logical type, convertToLogicalType fails because the writer schema is used and its logical type is null:
> {code:java}
> public static Object convertToLogicalType(Object datum, Schema schema, LogicalType type, Conversion<?> conversion) {
>   if (datum == null) {
>     return null;
>   }
>   if (schema == null || type == null || conversion == null) {
>     throw new IllegalArgumentException("Parameters cannot be null! Parameter values:"
>         + Arrays.deepToString(new Object[] { datum, schema, type, conversion }));
>   }
> {code}
>  
> I am wondering if we should check the logical type and properties in "unionEquiv" so that we can use the reader schema if there is a difference. This complies with the Avro spec.
>  
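
The check proposed in the description above can be sketched roughly as follows. This is only an illustration, not Avro's actual code: FieldSchema is a hypothetical, simplified stand-in for org.apache.avro.Schema, and typeOnlyEquiv / strictEquiv are made-up names. The "avro.java.string" key is the property Avro's Java library uses to request java.lang.String instead of Utf8. The sketch contrasts a type-only comparison with one that also compares the logical type and the properties.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Objects;

final class UnionEquivSketch {

    // Hypothetical stand-in for org.apache.avro.Schema; it models only the
    // three attributes discussed in the issue.
    static final class FieldSchema {
        final String type;               // e.g. "int", "string"
        final String logicalType;        // e.g. "date", or null if absent
        final Map<String, String> props; // e.g. {"avro.java.string": "String"}

        FieldSchema(String type, String logicalType, Map<String, String> props) {
            this.type = type;
            this.logicalType = logicalType;
            this.props = props;
        }
    }

    // Behavior described in the issue: only the type is compared.
    static boolean typeOnlyEquiv(FieldSchema write, FieldSchema read) {
        return write.type.equals(read.type);
    }

    // Proposed stricter check (an assumption about what the fix could look
    // like): the logical type and the properties must match as well.
    static boolean strictEquiv(FieldSchema write, FieldSchema read) {
        return write.type.equals(read.type)
            && Objects.equals(write.logicalType, read.logicalType)
            && write.props.equals(read.props);
    }

    public static void main(String[] args) {
        Map<String, String> none = Collections.emptyMap();

        // Case 2 from the issue: the writer emits a plain int, the reader expects a date.
        FieldSchema writerDate = new FieldSchema("int", null, none);
        FieldSchema readerDate = new FieldSchema("int", "date", none);

        // Case 1 from the issue: the reader asks for java.lang.String via a property.
        FieldSchema writerStr = new FieldSchema("string", null, none);
        FieldSchema readerStr = new FieldSchema("string", null,
            Collections.singletonMap("avro.java.string", "String"));

        System.out.println(typeOnlyEquiv(writerDate, readerDate)); // true
        System.out.println(strictEquiv(writerDate, readerDate));   // false
        System.out.println(typeOnlyEquiv(writerStr, readerStr));   // true
        System.out.println(strictEquiv(writerStr, readerStr));     // false
    }
}
```

With the stricter comparison, both cases from the description are reported as different, so a resolver built on it would fall back to the reader schema instead of reusing the writer's.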



--
This message was sent by Atlassian Jira
(v8.20.10#820010)