Posted to dev@avro.apache.org by "Vladimir Kralik (Jira)" <ji...@apache.org> on 2020/10/20 11:37:00 UTC

[jira] [Commented] (AVRO-2702) Avro ResolvingGrammarGenerator does not honor "avro.java.string" property in inner record schemas

    [ https://issues.apache.org/jira/browse/AVRO-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217534#comment-17217534 ] 

Vladimir Kralik commented on AVRO-2702:
---------------------------------------

h2. Issue:

{code:java}
// final List<Integer> li = avroDecoded.getListOfString().stream()
//                                                .map(String::length)
//                                                .collect(Collectors.toList());
final List<Integer> li = avroDecoded.getListOfString().stream()
                                                  .map(Object::toString) // FIXME: doesn't help, still throws the ClassCastException below
                                                  .map(String::length)
                                                  .collect(Collectors.toList());
{code}
{code}
java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.lang.String
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
{code}

h2. Workaround:
{code:java}
final List<Integer> li = ((List<?>)avroDecoded.getListOfString()).stream() // <= casting to List<?>
                                                  .map(Object::toString) // <= conversion to String
                                                  .map(String::length)
                                                  .collect(Collectors.toList());
{code}
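
The difference between the two snippets can be reproduced without Avro. The following is a minimal sketch, assuming that the decoded list behaves like a {{List<String>}} whose elements are not actually {{String}} instances; here a {{StringBuilder}} stands in for {{Utf8}}. The class {{HeapPollutionDemo}} and its methods are hypothetical names used only for illustration. With the element type declared as {{String}}, the {{Object::toString}} method reference appears to be compiled against {{String}}, so a cast is applied before {{toString()}} is called; viewing the list as {{List<?>}} makes the element type {{Object}}, and no cast is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class HeapPollutionDemo {

    // Builds a List<String> that really contains a StringBuilder.
    // This is an assumption-based stand-in for a decoded Avro list holding Utf8 values.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> pollutedList() {
        List raw = new ArrayList();
        raw.add(new StringBuilder("abc"));
        return (List<String>) raw;
    }

    // Same shape as the failing snippet: the stream is typed as Stream<String>.
    // Returns true if a ClassCastException was thrown, as in the report above.
    static boolean directAttemptThrows() {
        try {
            pollutedList().stream()
                          .map(Object::toString)
                          .map(String::length)
                          .collect(Collectors.toList());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Same shape as the workaround: the list is viewed as List<?>,
    // so Object::toString is applied to elements typed as Object.
    static List<Integer> viaWildcard() {
        return ((List<?>) pollutedList()).stream()
                                         .map(Object::toString)
                                         .map(String::length)
                                         .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("direct attempt throws: " + directAttemptThrows());
        System.out.println("via List<?>: " + viaWildcard());
    }
}
```

This only illustrates why the cast to {{List<?>}} sidesteps the exception on the caller's side; it does not change what the decoder produces.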



> Avro ResolvingGrammarGenerator does not honor "avro.java.string" property in inner record schemas
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-2702
>                 URL: https://issues.apache.org/jira/browse/AVRO-2702
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.1
>            Reporter: Thorsten Hake
>            Priority: Major
>              Labels: ClassCastException, Deserialize
>         Attachments: Bar.kt
>
>
> The type property "avro.java.string" selects the CharSequence implementation used for a string type in Java. The Avro Maven plugin sets this property in the Java code it generates when the <stringType> option is set to "String".
> However, the ResolvingGrammarGenerator, which helps match the writer schema to the reader schema, does not honor this property for inner records within unions. Strings in the inner record are deserialized to org.apache.avro.util.Utf8 instead of java.lang.String, while string properties of the outer record are correctly deserialized to java.lang.String.
> If you deserialize an Avro record whose schema has an inner record within a union type, using the Java code generated by the Maven plugin (with <stringType> set to "String"), you get a ClassCastException:
> {noformat}
> Caused by: java.lang.ClassCastException: class org.apache.avro.util.Utf8 cannot be cast to class java.lang.String
> {noformat}
> This is because the generated Java code expects the strings to be deserialized according to the "avro.java.string" property, which does not happen for the inner record.
> I would expect the deserializer to treat the strings in the inner record the same as the strings in the outer record.
> Example:
> writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "foo",
>   "fields": [
>     {
>       "name": "k",
>       "type": "string"
>     },
>     {
>       "name": "value",
>       "type": [
>         "null",
>         {
>           "type": "record",
>           "name": "bar",
>           "fields": [
>             {
>               "name": "str",
>               "type": "string"
>             }
>           ]
>         }
>       ]
>     }
>   ]
> }
> {code}
> reader schema:
> {code:json}
> {
>   "type": "record",
>   "name": "foo",
>   "fields": [
>     {
>       "name": "k",
>       "type": {
>         "type": "string",
>         "avro.java.string": "String"
>       }
>     },
>     {
>       "name": "value",
>       "type": [
>         "null",
>         {
>           "type": "record",
>           "name": "bar",
>           "fields": [
>             {
>               "name": "str",
>               "type": {
>                 "type": "string",
>                 "avro.java.string": "String"
>               }
>             }
>           ]
>         }
>       ]
>     }
>   ]
> }
> {code}
> Example Kotlin code demonstrating the problem is in the attached Bar.kt.


