Posted to dev@avro.apache.org by "Ryan Skraba (Jira)" <ji...@apache.org> on 2020/06/01 13:38:00 UTC

[jira] [Commented] (AVRO-2659) Some full names are not correctly validated.

    [ https://issues.apache.org/jira/browse/AVRO-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121026#comment-17121026 ] 

Ryan Skraba commented on AVRO-2659:
-----------------------------------

AVRO-1022 has an extensive discussion on whether or not Avro names should be opened to all Unicode.

Since the specification is currently pretty clear about names and other SDKs are (usually) following the spec for validation, I propose we align the Java SDK instead of loosening the spec.

> Some full names are not correctly validated.
> --------------------------------------------
>
>                 Key: AVRO-2659
>                 URL: https://issues.apache.org/jira/browse/AVRO-2659
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Ryan Skraba
>            Priority: Minor
>
> The Java implementation will parse a schema with an obviously bad name:
> {code:json}
> { "type":"record",
>   "name":"org.9apache.äv rö.🚎.$!#!@%$.Sïmplé۵",
>   "fields": [
>     {"name":"id", "type": "int"}
>   ]
> }
> {code}
> For Java:
>  * no rules are applied to the namespace part of a full name,
>  * the {{isLetterOrDigit}} methods in {{Character}} accept accented and other non-ASCII letters and digits, and
>  * _(ambiguous)_ when a full name is specified, the namespace attribute is entirely ignored. Should it be an error if it contains a bad namespace?
> We should apply the rules to every dot-delimited part of a full name and of a namespace (unless the namespace is ignored). The rules should be applied consistently to named types, field names, and aliases.
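
For illustration, here is a minimal sketch of what per-part validation could look like, assuming the name rule from the Avro specification (a name starts with [A-Za-z_] and continues with [A-Za-z0-9_] only). The class AvroNameCheck and its methods are hypothetical and are not part of the Avro Java SDK; the actual fix may be structured differently.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not an Avro API: checks names against the ASCII-only
// rule in the Avro specification instead of Character.isLetterOrDigit.
final class AvroNameCheck {
    // First character: letter or underscore; the rest: letters, digits, underscore.
    private static final Pattern SIMPLE_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private AvroNameCheck() {}

    // Validates a single dot-free name (record name, field name, alias part).
    static boolean isValidSimpleName(String name) {
        return name != null && SIMPLE_NAME.matcher(name).matches();
    }

    // Validates every dot-delimited part of a full name or namespace.
    static boolean isValidFullName(String fullName) {
        if (fullName == null || fullName.isEmpty()) {
            return false;
        }
        // A limit of -1 keeps empty parts, so "a..b" and "a." are rejected.
        for (String part : fullName.split("\\.", -1)) {
            if (!isValidSimpleName(part)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidFullName("org.apache.avro.Simple")); // true
        System.out.println(isValidFullName("org.9apache.Simple"));     // false
    }
}
```

With a check like this, the part 9apache in the example above is rejected because it starts with a digit, and the accented, emoji, and punctuation parts are rejected because they fall outside the ASCII character classes.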


