Posted to dev@avro.apache.org by "Thiruvalluvan M. G." <th...@yahoo.com> on 2010/12/07 11:58:57 UTC

Union with a single branch

The Java implementation allows unions with just one branch, but the C++
implementation doesn't. The spec is silent on this point.

Is there a need for single-branch unions?

There could be an argument that single-branch unions can be used for future
extensions. But I don't think it is needed because our resolution spec
allows matching standalone entities with unions as long as the entity's type
is one of the branches in the union.
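To illustrate the resolution rule mentioned above, here is a minimal Python sketch. It is not taken from any Avro implementation: the function name resolve_against_union is hypothetical, schemas are reduced to primitive type names, and type promotion and recursive resolution are ignored.

```python
def resolve_against_union(writer_type, reader_union):
    """Return the index of the first reader branch that matches writer_type.

    Simplified assumption: schemas are plain type-name strings and two
    schemas match only when the names are equal.
    """
    for index, branch in enumerate(reader_union):
        if branch == writer_type:
            return index
    raise ValueError(
        "writer type %r matches no branch of %r" % (writer_type, reader_union)
    )


# A standalone "string" written today can be read later with a union schema,
# so a single-branch union is not needed to leave room for extension.
chosen = resolve_against_union("string", ["null", "string"])
```

With these inputs, chosen is the index of the "string" branch in the reader's union; a writer type absent from the union raises an error instead.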

Another argument could be that data written using a single-branch union can
be read by a multi-branch union without using schema resolution. But we do
not want to encourage such usage. If the schemas for reader and writer are
different (in whatever way), we want people to use schema resolution.

The only valid argument I can think of is that someone may already be
using single-branch unions, and tightening the spec would break their code.
Tightening the spec would also mean that every language implementation has
to fix the problem, if it hasn't already. In any case, we need to make the
implementations consistent and make the specification explicit in this
regard.

Any thoughts?

Thanks

Thiru


Re: Union with a single branch

Posted by Doug Cutting <cu...@apache.org>.
On 12/07/2010 10:14 AM, Scott Carey wrote:
> Making all implementations capable of reading already persisted
> single-branch unions but incapable of writing them doesn't seem like
> a good way forward.  We probably have to just support single branch
> unions and put them in the spec.  I don't think that is a burden from
> a code maintenance point of view -- a single branch isn't much of a
> special case, it's more of a degenerate case.  We should discourage
> their use though.

+1

Doug

Re: Union with a single branch

Posted by Scott Carey <sc...@richrelevance.com>.
On Dec 7, 2010, at 2:58 AM, Thiruvalluvan M. G. wrote:

> The Java implementation allows unions with just one branch, but the C++
> implementation doesn't. The spec is silent on this point.
> 
> Is there a need for single-branch unions?
> 
> There could be an argument that single-branch unions can be used for future
> extensions. But I don't think it is needed because our resolution spec
> allows matching standalone entities with unions as long as the entity's type
> is one of the branches in the union.

Agreed.

> 
> Another argument could be that data written using a single-branch union can
> be read by a multi-branch union without using schema resolution. But we do
> not want to encourage such usage. If the schemas for reader and writer are
> different (in whatever way), we want people to use schema resolution.
> 

Not only that, but the single-branch union would have to coincide with the first branch of the multi-branch union.  That's asking for trouble.  Reader/writer schema resolution is always required unless the schemas are identical.  The resolver could note that the written union's branch subset is smaller and in the same order as the reader's and thus compatible, but this compatibility check needs to be in the resolver, not left to the user.
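The point about the first branch follows from the binary encoding, where a union value is written as the branch index (a zigzag, variable-length long) followed by the value itself. The sketch below is plain Python with no Avro library; the helper names (encode_long, decode_long, encode_string, decode_union) are hypothetical and only the "null" and "string" branches are handled.

```python
def encode_long(n):
    """Zigzag plus variable-length encoding of a long."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Inverse of encode_long; returns (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_string(s):
    """A string is its UTF-8 byte length as a long, then the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def decode_union(buf, branches):
    """Decode one union value WITHOUT schema resolution.

    Returns (branch name, value, bytes consumed).
    """
    index, pos = decode_long(buf, 0)
    branch = branches[index]
    if branch == "null":
        return branch, None, pos
    if branch == "string":
        length, pos = decode_long(buf, pos)
        return branch, buf[pos:pos + length].decode("utf-8"), pos + length
    raise ValueError("branch %r not handled in this sketch" % branch)


# Written with the single-branch union ["string"]: branch index 0, then "hi".
written = encode_long(0) + encode_string("hi")

# The reader's first branch coincides with the writer's, so this happens to work.
same_first = decode_union(written, ["string", "null"])

# The reader's first branch differs: index 0 is taken as null and the
# string bytes are left unread, which silently corrupts the stream.
different_first = decode_union(written, ["null", "string"])
```

In the second case nothing fails at decode time; the reader simply consumes fewer bytes than were written, which is why the check belongs in the resolver.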

> The only valid argument I could think of is that someone may already be
> using single-branch unions.

I fear that even if only the Java implementation supported single branch unions, there would likely still be persisted single branch unions out in the wild.

> Tightening the spec would break their code.
> Tightening the spec would also mean that every language implementation has
> to fix the problem, if it hasn't already. In any case, we need to make the
> implementations consistent and make the specification explicit in this
> regard.

Making all implementations capable of reading already persisted single-branch unions but incapable of writing them doesn't seem like a good way forward.  We probably have to just support single-branch unions and put them in the spec.  I don't think that is a burden from a code maintenance point of view -- a single branch isn't much of a special case, it's more of a degenerate case.  We should discourage their use though.  Any idea what the other implementations do?  I suppose we need to add a single-branch union to the interop tests and find out.
 