Posted to issues@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/01/24 18:06:00 UTC

[jira] [Resolved] (IMPALA-9068) Impala should respect the distinction between the managed warehouse and the external warehouse

     [ https://issues.apache.org/jira/browse/IMPALA-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-9068.
-----------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Fixed

> Impala should respect the distinction between the managed warehouse and the external warehouse
> ----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-9068
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9068
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.4.0
>            Reporter: Joe McDonnell
>            Priority: Blocker
>             Fix For: Impala 3.4.0
>
>
> Recent versions of Hive 3 distinguish between the directory for managed tables and the directory for external tables.
> {code:java}
> WAREHOUSE("metastore.warehouse.dir", "hive.metastore.warehouse.dir", "/user/hive/warehouse",
>         "location of default database for the warehouse"),
> WAREHOUSE_EXTERNAL("metastore.warehouse.external.dir", "hive.metastore.warehouse.external.dir", "",
>         "Default location for external tables created in the warehouse. " +
>         "If not set or null, then the normal warehouse location will be used as the default location."),
> {code}
> With HIVE-22189, Hive strictly enforces this distinction: it no longer allows external tables in hive.metastore.warehouse.dir (the managed directory). CREATE TABLE statements are currently translated to CREATE EXTERNAL TABLE statements with appropriate table properties, but for this to work correctly, hive.metastore.warehouse.external.dir must be set to a different directory than hive.metastore.warehouse.dir. A sensible approach is to set hive.metastore.warehouse.external.dir to /test-warehouse and change hive.metastore.warehouse.dir to something else, such as /test-warehouse-managed.
> This will require further changes in our test infrastructure to incorporate this distinction. For example, tests/comparison/cluster/cluster.py's warehouse_dir needs to handle this appropriately (this is needed for testdata/bin/load_nested.py). It may also require changes to some paths for tests that use managed tables.
> hive.metastore.warehouse.external.dir does not exist in Hive 2, so this will require some Hive 3 specific logic.
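
The fallback rule quoted in the configuration description, together with the Hive 2 caveat, could be sketched in Python, the language of the test infrastructure named above. This is only an illustration under stated assumptions, not the change that resolved this issue: resolve_warehouse_dirs, the constant names, and the plain dict standing in for the Hive configuration are all hypothetical and do not come from the Impala code base. Only the two property names and the /user/hive/warehouse default are taken from the quoted snippet.

```python
# Hypothetical helper: none of these names exist in Impala's test infrastructure.
MANAGED_KEY = "hive.metastore.warehouse.dir"
EXTERNAL_KEY = "hive.metastore.warehouse.external.dir"
# Default for the managed warehouse, as shown in the quoted Hive snippet.
DEFAULT_MANAGED_DIR = "/user/hive/warehouse"


def resolve_warehouse_dirs(hive_conf, hive_major_version):
    """Return (managed_dir, external_dir) for a dict of Hive settings."""
    managed_dir = hive_conf.get(MANAGED_KEY) or DEFAULT_MANAGED_DIR
    if hive_major_version < 3:
        # Hive 2 has no external warehouse setting, so one directory serves both.
        return managed_dir, managed_dir
    # Hive 3: an unset or empty value falls back to the normal warehouse location.
    external_dir = hive_conf.get(EXTERNAL_KEY) or managed_dir
    return managed_dir, external_dir


if __name__ == "__main__":
    # The layout proposed in the issue description.
    conf = {MANAGED_KEY: "/test-warehouse-managed", EXTERNAL_KEY: "/test-warehouse"}
    print(resolve_warehouse_dirs(conf, 3))
```

A caller in the style of cluster.py's warehouse_dir could then pick the second element for external tables and the first for managed tables, while Hive 2 builds keep seeing a single directory.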



--
This message was sent by Atlassian Jira
(v8.3.4#803005)