Posted to dev@avro.apache.org by "Kalle Niemitalo (Jira)" <ji...@apache.org> on 2022/08/01 06:53:00 UTC

[jira] [Commented] (AVRO-3532) Align naming rules on code

    [ https://issues.apache.org/jira/browse/AVRO-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573572#comment-17573572 ] 

Kalle Niemitalo commented on AVRO-3532:
---------------------------------------

Names in Avro are case-sensitive, but one can generate C# source code from an Avro schema, compile it to a library, and use that library from a case-insensitive language such as Visual Basic or PowerShell. So the schema author should choose names that won't conflict in the languages from which they are intended to be used (unless the code generators are able to remap the names). I think it would be okay to handle Unicode normalization in a similar way: require identical code points in the Avro schema, and let the target languages normalize the names however they like. The Avro specification could advise schema authors to choose names that won't conflict in popular languages, and there could be a validator tool that warns about potential conflicts, but I don't think every code generator needs to do this validation.
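
As a rough illustration of such a validator, here is a minimal sketch that groups names which differ only by case or by Unicode normalization form. The class NameConflictChecker and its methods are hypothetical and not part of any Avro library, and the folding rule (NFC followed by simple case folding) is an assumption; real target languages may compare identifiers differently.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Avro.
final class NameConflictChecker {

    // Assumed folding rule: normalize to NFC, then fold case in a
    // locale-independent way. Names with the same key may clash in a
    // case-insensitive or normalizing language.
    static String foldKey(String name) {
        String nfc = Normalizer.normalize(name, Normalizer.Form.NFC);
        return nfc.toUpperCase(Locale.ROOT).toLowerCase(Locale.ROOT);
    }

    // Returns each group of distinct names that share a folded key.
    static List<List<String>> findConflicts(List<String> names) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String name : names) {
            List<String> group =
                groups.computeIfAbsent(foldKey(name), k -> new ArrayList<>());
            if (!group.contains(name)) {
                group.add(name); // exact duplicates are not conflicts
            }
        }
        List<List<String>> conflicts = new ArrayList<>();
        for (List<String> group : groups.values()) {
            if (group.size() > 1) {
                conflicts.add(group);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // "caf\u00e9" is precomposed; "cafe\u0301" uses a combining accent.
        List<String> names = Arrays.asList(
            "userId", "UserID", "caf\u00e9", "cafe\u0301", "total");
        for (List<String> group : findConflicts(names)) {
            System.out.println("warning: names may conflict: " + group);
        }
    }
}
```

With the sample names, this reports two groups: one that differs only in case and one that differs only in normalization form. It only warns; the schema itself would still be compared by exact code points.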

> Align naming rules on code
> --------------------------
>
>                 Key: AVRO-3532
>                 URL: https://issues.apache.org/jira/browse/AVRO-3532
>             Project: Apache Avro
>          Issue Type: Wish
>            Reporter: Christophe Le Saec
>            Priority: Major
>
> The [naming rules in the documentation|https://avro.apache.org/docs/current/spec.html#names] say that a name must:
> {noformat}
>     - start with [A-Za-z_]
>     - subsequently contain only [A-Za-z0-9_]
> {noformat}
> But the [Java code|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L1578] uses the Character.isLetter and Character.isLetterOrDigit methods:
> {code:java}
>     char first = name.charAt(0);
>     if (!(Character.isLetter(first) || first == '_'))
>       throw new SchemaParseException("Illegal initial character: " + name);
>     for (int i = 1; i < length; i++) {
>       char c = name.charAt(i);
>       if (!(Character.isLetterOrDigit(c) || c == '_'))
>         throw new SchemaParseException("Illegal character in: " + name);
>     }
>     return name;
> {code}
> These methods accept accented characters (éùàçË ...) and also Chinese characters (我 ...).
> So the aim of this ticket is to see whether we can update the documentation, provided the other implementations (Rust, C# ...) are also compatible with this behavior.
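
The gap between the documented rule and the quoted Java check can be reproduced with a small sketch. The class NameRuleComparison is hypothetical: matchesSpec encodes the two documented rules as a regular expression, and matchesJavaCheck mirrors the character tests from the quoted snippet but returns a boolean instead of throwing. Rejecting the empty string is an assumption, since the quoted excerpt does not show how the surrounding method handles it.

```java
import java.util.regex.Pattern;

// Hypothetical comparison helper for illustration only; not part of Avro.
final class NameRuleComparison {

    // The documented rule: start with [A-Za-z_], then only [A-Za-z0-9_].
    private static final Pattern SPEC_NAME =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean matchesSpec(String name) {
        return SPEC_NAME.matcher(name).matches();
    }

    // Same character tests as the quoted snippet, without the exceptions.
    static boolean matchesJavaCheck(String name) {
        if (name.isEmpty()) {
            return false; // assumption: empty names are rejected
        }
        char first = name.charAt(0);
        if (!(Character.isLetter(first) || first == '_')) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!(Character.isLetterOrDigit(c) || c == '_')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "\u00e9t\u00e9" is "été"; "\u6211" is the Chinese character 我.
        String[] samples = {"user_id", "\u00e9t\u00e9", "\u6211", "1abc"};
        for (String name : samples) {
            System.out.println(name
                + " spec=" + matchesSpec(name)
                + " java=" + matchesJavaCheck(name));
        }
    }
}
```

Names such as "été" and "我" pass the Character-based check but fail the documented ASCII-only pattern, which is the discrepancy the issue describes.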



--
This message was sent by Atlassian Jira
(v8.20.10#820010)