Posted to dev@iceberg.apache.org by Ryan Blue <rb...@netflix.com.INVALID> on 2019/05/10 00:01:11 UTC

Re: Enums in Iceberg

+Iceberg Dev List <de...@iceberg.apache.org>

Ryan,

Iceberg doesn't currently support enums because enums cause unnecessary
schema evolution problems. You can't delete a symbol from an enum without
breaking backward compatibility with existing data. That means symbol
additions cannot be rolled back easily, which is an annoyance when a user
adds a symbol with a typo. There is also no corresponding SQL type or
syntax, so you can't manage an enum with DDL.
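
To make the rollback problem concrete, here is a minimal sketch in plain
Python. It is not Iceberg code: StatusV1, StatusV2, old_rows, and decode
are hypothetical names, and Python's enum module only stands in for an
enum type in a file format.

```python
from enum import Enum


class StatusV1(Enum):
    """Hypothetical enum as first published, including a typo'd symbol."""
    ACTIVE = "ACTIVE"
    INACTVE = "INACTVE"  # typo that was already written into data files


class StatusV2(Enum):
    """Hypothetical 'fixed' enum: the typo'd symbol was removed."""
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"


# Values already written under StatusV1 (assumed sample data).
old_rows = ["ACTIVE", "INACTVE"]


def decode(rows, enum_type):
    """Decode stored symbols with the given enum; unknown symbols raise ValueError."""
    return [enum_type(value) for value in rows]


decode(old_rows, StatusV1)      # works: every stored symbol is known

try:
    decode(old_rows, StatusV2)  # fails: "INACTVE" no longer exists
except ValueError as err:
    print("cannot read old data:", err)
```

Once a symbol has been written, the reader has to keep it around, so the
typo can't be cleaned up by editing the type.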

We could probably solve those issues, but there isn't much value provided
by enums. Most people I've talked to want enums in order to save space,
but dictionary encoding does that automatically for strings, and
compression in Avro also does a good job. The other argument I've heard for
adding enums is that they provide a guarantee that values are in some set,
but adding a symbol violates that assumption in a way that's harder to deal
with than a validation failure. You could disallow adding new symbols, but
then you have a type that can't evolve, which is equally annoying. When
compared to using a string and validating assumptions about it if you need
to, I think using a string wins.
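
As a rough illustration of the string-plus-validation approach, here is a
minimal Python sketch. It is not an Iceberg API: ALLOWED_STATUSES and
invalid_values are hypothetical names, and the check would live in your
own write or audit path.

```python
# Hypothetical allowed set, owned by the application rather than the schema.
ALLOWED_STATUSES = frozenset({"ACTIVE", "INACTIVE"})


def invalid_values(values, allowed=ALLOWED_STATUSES):
    """Return the distinct non-null values that are not in the allowed set."""
    return sorted({v for v in values if v is not None and v not in allowed})


column = ["ACTIVE", "INACTVE", None, "ACTIVE"]

bad = invalid_values(column)
if bad:
    print("unexpected values:", bad)

# Evolving the set is a change to the validator, not to the table schema.
wider = ALLOWED_STATUSES | {"INACTVE"}
assert invalid_values(column, wider) == []
```

The column stays a plain string, so adding or dropping an allowed value
never touches existing data; only the check changes.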

rb

On Thu, May 9, 2019 at 1:45 PM Ryan Peterson <ry...@wework.com>
wrote:

> Hi Ryan,
>
> Had a qq. Why does Iceberg not support Enums?
>
> Read through the Table Spec,
> https://docs.google.com/document/d/1Q-zL5lSCle6NEEdyfiYsXYzX_Q8Qf0ctMyGBKslOswA/edit#heading=h.lnntcfijclre,
> and didn't find an answer there.
>
> Thanks!
>
> --
> *WeWork | Ryan Peterson*
> Software Engineer, Data Platform
> wework.com <http://www.wework.com/>
>
> Create Your Life's Work
>


-- 
Ryan Blue
Software Engineer
Netflix