Posted to dev@avro.apache.org by "Thiruvalluvan M. G. (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2011/10/07 19:58:30 UTC

[jira] [Issue Comment Edited] (AVRO-840) C++ generate nullable types for optional fields in the schema

    [ https://issues.apache.org/jira/browse/AVRO-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123001#comment-13123001 ] 

Thiruvalluvan M. G. edited comment on AVRO-840 at 10/7/11 5:56 PM:
-------------------------------------------------------------------

The attached patch addresses the problem as follows:

For union types, there is a new {{bool is_null() const;}} member function, which can be used to query whether the union is {{null}}. This complements the existing {{set_null();}} member function.
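
As a rough illustration of how the two member functions fit together, here is a hand-written stand-in for a union of null and string. It is not the generated code: the class name {{NullOrString}}, the string accessors and the helper {{middle_name_or}} are assumptions made for this sketch; only {{is_null()}} and {{set_null()}} come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hand-written stand-in for a generated union of null and string.
// The class name and the string accessors are illustrative assumptions.
class NullOrString {
public:
    NullOrString() : idx_(0) {}                     // starts out as null

    bool is_null() const { return idx_ == 0; }      // the new query
    void set_null() { idx_ = 0; value_.clear(); }   // the existing setter

    void set_string(const std::string& v) { idx_ = 1; value_ = v; }
    const std::string& get_string() const {
        assert(idx_ == 1 && "union does not hold a string");
        return value_;
    }

private:
    std::size_t idx_;     // 0 = null branch, 1 = string branch
    std::string value_;
};

// Hypothetical helper: returns the string, or a fallback when the union is null.
inline std::string middle_name_or(const NullOrString& u,
                                  const std::string& fallback) {
    return u.is_null() ? fallback : u.get_string();
}
```

With {{is_null()}} available, callers can branch on the null case directly instead of comparing the branch index by hand.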

The names of union types are generated to avoid conflicts with other types, so they are not easy for a human programmer to remember or make sense of. To help, if the union type is a member of a record, the type is aliased to the member variable's name using {{typedef}}. That is, if there is a member variable {{rev}} which is of union type {{_xyz_Union__0__}}, then the type is {{typedef}}ed to {{rev_t}}. The {{typedef}} is local to the record type, which makes conflicts with other types rare.
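
A minimal sketch of what the aliasing could look like, assuming a hypothetical record {{Doc}} with a union member {{rev}}. The name {{xyz_Union_0}} stands in for a generated name such as the one quoted above (simplified here to avoid reserved identifiers), and {{make_doc}} is an illustrative helper, not generated code.

```cpp
#include <string>
#include <type_traits>

// Stand-in for a union type with a generated, hard-to-remember name.
struct xyz_Union_0 {
    bool null = true;
    std::string value;
    bool is_null() const { return null; }
    void set_null() { null = true; value.clear(); }
    void set_string(const std::string& v) { null = false; value = v; }
};

// Hypothetical record with a union member named rev.
struct Doc {
    typedef xyz_Union_0 rev_t;   // alias scoped to the record
    std::string id;
    rev_t rev;
};

// Callers name the type through the record, not through the generated name.
inline Doc make_doc(const std::string& id, const std::string& rev) {
    Doc d;
    d.id = id;
    Doc::rev_t r;
    r.set_string(rev);
    d.rev = r;
    return d;
}

// The alias introduces no new type; it is only another name for the union.
static_assert(std::is_same<Doc::rev_t, xyz_Union_0>::value,
              "rev_t is only an alias");
```

Because the alias lives inside {{Doc}}, another record can have its own {{rev_t}} without any clash.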

However, if the record already has another member named {{rev_t}}, the generated code will fail to compile. To help in such situations, avrogencpp takes a new command-line switch, {{-U}} or {{--no-union-typedef}}, which, if present, disables the {{typedef}} generation for unions within records.
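
To make the conflict concrete, the sketch below assumes a hypothetical record with two fields, {{rev}} (a union) and {{rev_t}} (a string). The failing form is shown only in a comment; the compiled form is roughly what one would expect with {{--no-union-typedef}}, where the union is referred to by its generated name. All type names here are illustrative.

```cpp
#include <string>

// Stand-in for a generated union type (hypothetical name).
struct xyz_Union_0 {
    bool null = true;
    std::string value;
    bool is_null() const { return null; }
};

// With the typedef enabled, the record would look something like this:
//
//     struct Doc {
//         typedef xyz_Union_0 rev_t;  // alias for the type of rev
//         rev_t rev;
//         std::string rev_t;          // error: clashes with the typedef
//     };
//
// With the typedef disabled, the member uses the generated name directly
// and the record compiles:
struct Doc {
    xyz_Union_0 rev;
    std::string rev_t;
};
```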

A better way to handle this conflict is to allow annotations in the schema. But the current schema parser makes annotation support hard to add, so we'll add it when we improve the C++ schema parser.
                
> C++ generate nullable types for optional fields in the schema
> ---------------------------------------------------------------
>
>                 Key: AVRO-840
>                 URL: https://issues.apache.org/jira/browse/AVRO-840
>             Project: Avro
>          Issue Type: Improvement
>          Components: c++
>            Reporter: Ramana Suvarapu
>            Priority: Critical
>              Labels: C++
>         Attachments: AVRO-840.patch
>
>
> To represent optional fields, we use unions in our schema. See the example below.
> {
>    "type" : "record",
>    "name" : "Contact",
>    "fields" : [ 
>     {"name" : "FirstName","type" : ["string" ]},
>     {"name" : "MiddleName","type" : ["null", "string" ]},
>     {"name" : "LastName",  "type" : ["string" ]},
>     {"name" : "PhoneNum","type" : ["null", "string" ]},
>     {"name" : "Id","type" : ["null", "long" ]}
>    ] 
> }
> In this schema, the PhoneNum, MiddleName and Id fields are declared as unions because they are optional. For this schema, avrogencpp generates a Contact structure and three separate union structures, one for each optional field.
> struct  Contact
> {
>     String FirstName;
>     String LastName;
>     Union_0 MiddleName;
>     Union_1 PhoneNum;
>     Union_3 Id;
> };
> Instead, is it possible to create a new template-based Nullable type to represent optional fields? Basically, if the schema has a union with two branches and the first branch is null, it should generate a Nullable type.
> For the above schema, it should generate something like this.
> struct  Contact
> {
>    String FirstName;
>    String LastName;
>    Nullable<string> MiddleName;
>    Nullable<string> PhoneNum;
>    Nullable< long > Id;
> };
> This will reduce the number of generated unions in generated code and improve the readability and usability of the code.
> Let me know if it's feasible to implement this.
> Thanks,
> Ramana
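
For reference, one way the {{Nullable}} template requested above could be sketched is shown below. This is only an illustration of the reporter's idea, not something avrogencpp generates: the member names {{is_null}}, {{set_null}}, {{set}} and {{get}} are assumptions, and the sketch requires {{T}} to be default-constructible.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical Nullable<T>: holds either nothing (null) or a T.
// Assumes T is default-constructible and copyable.
template <typename T>
class Nullable {
public:
    Nullable() : has_value_(false), value_() {}            // null by default
    Nullable(const T& v) : has_value_(true), value_(v) {}  // wraps a value

    bool is_null() const { return !has_value_; }
    void set_null() { has_value_ = false; value_ = T(); }
    void set(const T& v) { has_value_ = true; value_ = v; }

    // Throws if the field is null, so misuse is not silent.
    const T& get() const {
        if (!has_value_) throw std::logic_error("Nullable is null");
        return value_;
    }

private:
    bool has_value_;
    T value_;
};

// The Contact record from the description, using the sketch above.
struct Contact {
    std::string FirstName;
    std::string LastName;
    Nullable<std::string> MiddleName;
    Nullable<std::string> PhoneNum;
    Nullable<long> Id;
};
```

A single template like this would replace the three per-field union structures, at the cost of covering only the two-branch, null-first case.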
