Posted to dev@avro.apache.org by "Christophe Le Saec (Jira)" <ji...@apache.org> on 2022/06/09 14:09:00 UTC

[jira] [Comment Edited] (AVRO-3532) Align naming rules on code

    [ https://issues.apache.org/jira/browse/AVRO-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552239#comment-17552239 ] 

Christophe Le Saec edited comment on AVRO-3532 at 6/9/22 2:08 PM:
------------------------------------------------------------------

Regarding the [discussion|https://lists.apache.org/thread/8q666sll1hkh7hq7kbdthlv2dd9dfnr8], I share Nilesh's opinion: a lot of data sources accept UTF-8 names, like the GCP storage he mentioned, but also relational databases, the JSON file format ...

Transport data formats like [Apache Arrow|https://arrow.apache.org/docs/format/Columnar.html#struct-layout] ({_}Each field must have a UTF8-encoded name{_}) or Amazon Ion ({_}which accepts any JSON{_}) also accept UTF-8 names, and they compete directly with Avro.

So, I think it would be great to evolve Avro to align with them (but I understand this would be a lot of work).


> Align naming rules on code
> --------------------------
>
>                 Key: AVRO-3532
>                 URL: https://issues.apache.org/jira/browse/AVRO-3532
>             Project: Apache Avro
>          Issue Type: Wish
>            Reporter: Christophe Le Saec
>            Priority: Major
>
> The [naming rule in the documentation|https://avro.apache.org/docs/current/spec.html#names] says that names must:
> {noformat}
>     - start with [A-Za-z_]
>     - subsequently contain only [A-Za-z0-9_]
> {noformat}
> But the [Java code|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L1578] uses the Character.isLetter and Character.isLetterOrDigit methods:
> {code:java}
>     char first = name.charAt(0);
>     if (!(Character.isLetter(first) || first == '_'))
>       throw new SchemaParseException("Illegal initial character: " + name);
>     for (int i = 1; i < length; i++) {
>       char c = name.charAt(i);
>       if (!(Character.isLetterOrDigit(c) || c == '_'))
>         throw new SchemaParseException("Illegal character in: " + name);
>     }
>     return name;
> {code}
> These methods accept accented letters (éùàçË ...) and also Chinese characters (我 ...), which the documented rule excludes.
> So, the aim of this ticket is to see whether we can update the documentation, and whether the other implementations (Rust, C# ...) are compatible with this behavior.
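
To make the mismatch described in the issue concrete, here is a minimal sketch that compares the documented rule with the character checks quoted above. The class name NamingRuleCheck and both method names are hypothetical and are not part of Avro; the second method only mirrors the quoted loop, plus an empty-string guard added here as an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Avro: compares the documented naming rule
// with the Character-based checks quoted in the issue description.
class NamingRuleCheck {
    // Documented rule: start with [A-Za-z_], then only [A-Za-z0-9_].
    private static final Pattern SPEC = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean matchesSpec(String name) {
        return SPEC.matcher(name).matches();
    }

    // Mirrors the quoted loop, returning false instead of throwing.
    // The empty-string guard is an assumption made for this sketch.
    static boolean acceptedByCharacterChecks(String name) {
        if (name.isEmpty()) {
            return false;
        }
        char first = name.charAt(0);
        if (!(Character.isLetter(first) || first == '_')) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!(Character.isLetterOrDigit(c) || c == '_')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "\u00e9t\u00e9" is "été" and "\u6211" is "我", escaped to avoid source-encoding issues.
        String[] samples = {"_x1", "\u00e9t\u00e9", "\u6211", "1abc"};
        for (String s : samples) {
            System.out.println(s + " spec=" + matchesSpec(s)
                + " code=" + acceptedByCharacterChecks(s));
        }
    }
}
```

With these samples, the two checks agree on "_x1" (accepted) and "1abc" (rejected), but disagree on "été" and "我": the documented pattern rejects them while the Character-based checks accept them.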


