You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "Marcel Pfütze (Jira)" <ji...@apache.org> on 2019/10/16 11:44:00 UTC

[jira] [Created] (AVRO-2597) GenricUnion not accessible

Marcel Pfütze created AVRO-2597:
-----------------------------------

             Summary: GenricUnion not accessible
                 Key: AVRO-2597
                 URL: https://issues.apache.org/jira/browse/AVRO-2597
             Project: Apache Avro
          Issue Type: Bug
          Components: c++
    Affects Versions: 1.9.1, 1.9.0
            Reporter: Marcel Pfütze


In version 1.9.0 GenricDatum.hh was expanded by functionality that was formerly just in the GenricUnion class.

The following was added:
* isUnion()
* selectBranch()
* unionBranch()

Additionally the methods to access access type and value were overwritten to have a special if case for the union data type: 
{code:c++}
inline Type GenericDatum::type() const {
    return (type_ == AVRO_UNION) ?
#if __cplusplus >= 201703L
        std::any_cast<GenericUnion>(&value_)->datum().type() :
#else
        boost::any_cast<GenericUnion>(&value_)->datum().type() :
#endif
        type_;
}

template<typename T> T& GenericDatum::value() {
    return (type_ == AVRO_UNION) ?
#if __cplusplus >= 201703L
        std::any_cast<GenericUnion>(&value_)->datum().value<T>() :
        *std::any_cast<T>(&value_);
#else
        boost::any_cast<GenericUnion>(&value_)->datum().value<T>() :
        *boost::any_cast<T>(&value_);
#endif
}
{code}

The result of this is, that it's impossible to access the GenericUnion from a GenricDatum as the code always points to the branch of the union.
Currently the only way to get the number of branches this union has, is by calling selectbranch() in a for-loop until the underlying leafAt() throws an exception.
In avro 1.8.2 the following code could be used to select a unionbranch of a specific type.
{code:c++}
inline avro::GenericDatum & select_union_branch(avro::GenericDatum & datum, const avro::Type type)
{
    auto & un     =  datum.value<avro::GenericUnion>();
    auto & schema = un.schema();
    auto leaves   = schema->leaves();
    for(std::size_t i=0; i < leaves; i++) {
        if(schema->leafAt(i)->type() == type) {
            un.selectBranch(i);
            return un.datum();
        }
    }
    throw std::runtime_error(
            "select_union_branch: can't find branch with type=" +
            std::to_string(type));
}
{code}

My proposals:
# Revert changes to make GenericDatum reflect methods of GenericUnion.
## A Union is a normal data type of Avro like Records or Lists. What is the point of hiding the type?
## The getters don't just return their value but do an extra check or even cast before returning. This is counterintuitive.
# Add the schema() method to the GenericDatum class. At least the full functionality of the GenricUnion should be accessible if you can't access the GenericUnion itself.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)