Posted to dev@iceberg.apache.org by Huadong Liu <hu...@gmail.com> on 2021/04/28 00:52:33 UTC

Iceberg tables not using hive catalog's hive.metastore.warehouse.dir

Hi Iceberg Dev,

Iceberg tables with the Hive catalog are created under
hive.metastore.warehouse.dir/<db> by default. Table locations
<https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/BaseMetastoreCatalog.html#createTable-org.apache.iceberg.catalog.TableIdentifier-org.apache.iceberg.Schema-org.apache.iceberg.PartitionSpec-java.lang.String-java.util.Map->
outside the default hive.metastore.warehouse.dir are sometimes chosen for
various reasons (e.g. ownership separation and improved performance). The
catalog namespace
<https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/hive/HiveCatalog.html#createNamespace-org.apache.iceberg.catalog.Namespace-java.util.Map->
still has to be created under hive.metastore.warehouse.dir, though. It is
effectively an empty directory if tables are created in other locations.

Is there any concern with creating tables
outside hive.metastore.warehouse.dir?

--
Huadong

Re: Iceberg tables not using hive catalog's hive.metastore.warehouse.dir

Posted by Huadong Liu <hu...@gmail.com>.
Hi Peter, thanks for the info. I will open tickets if any issues show up.

On Wed, Apr 28, 2021 at 4:14 AM Peter Vary <pv...@cloudera.com.invalid>
wrote:

> Hi Huadong,
>
> If the table location is not provided, the table is automatically
> placed under the database (namespace) location. If a location is
> provided, it can point anywhere and the table should still work.
> The default directory structure can help with organizing your data, but
> in the end it is up to the user to decide.
>
> All that said, I do not know of extensive tests around this, so if you
> encounter issues, feel free to open a ticket.
>
> Thanks,
> Peter

Re: Iceberg tables not using hive catalog's hive.metastore.warehouse.dir

Posted by Peter Vary <pv...@cloudera.com.INVALID>.
Hi Huadong,

If the table location is not provided, the table is automatically placed under the database (namespace) location. If a location is provided, it can point anywhere and the table should still work.
The default directory structure can help with organizing your data, but in the end it is up to the user to decide.
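
The rule described above can be sketched as a small, self-contained Java model. This is only an illustration of the behavior as stated in this thread, not Iceberg's actual implementation: the class name TableLocationSketch, the method resolve, and the example paths are all hypothetical. In the real API, the explicit location is the String argument of createTable(TableIdentifier, Schema, PartitionSpec, String, Map) linked in the original message.

```java
// Illustrative model only; this is NOT Iceberg code. All names here are hypothetical.
final class TableLocationSketch {
    private TableLocationSketch() {}

    /**
     * Models the rule from the thread: use the explicit location when one is
     * given, otherwise fall back to "<namespaceLocation>/<tableName>".
     */
    static String resolve(String namespaceLocation, String tableName, String explicitLocation) {
        if (explicitLocation != null && !explicitLocation.isEmpty()) {
            // An explicit location may point anywhere, including outside the warehouse dir.
            return stripTrailingSlash(explicitLocation);
        }
        // No location given: place the table under the namespace (database) location.
        return stripTrailingSlash(namespaceLocation) + "/" + tableName;
    }

    // Removes a single trailing slash so that joined paths do not contain "//".
    private static String stripTrailingSlash(String path) {
        return path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
    }

    public static void main(String[] args) {
        // Assumed example paths, chosen only for illustration.
        String namespaceLocation = "hdfs://nn/warehouse/db1";
        System.out.println(resolve(namespaceLocation, "events", null));
        System.out.println(resolve(namespaceLocation, "events", "s3://bucket/custom/events"));
    }
}
```

In this model, the namespace directory stays empty whenever every table passes an explicit location, which matches the observation in the original question.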

All that said, I do not know of extensive tests around this, so if you encounter issues, feel free to open a ticket.

Thanks,
Peter
