You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/10/07 18:59:00 UTC

[jira] [Commented] (IMPALA-10172) Support Hive metastore managed locations for databases

    [ https://issues.apache.org/jira/browse/IMPALA-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209781#comment-17209781 ] 

ASF subversion and git services commented on IMPALA-10172:
----------------------------------------------------------

Commit 76f9b75c8b701e50378ce92a729f01768da55171 in impala's branch refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=76f9b75 ]

IMPALA-10172: Support Hive metastore managed locations for databases

This change lets the user set the managed location path in new
databases, e.g.
CREATE DATABASE db MANAGEDLOCATION 'some url';

This property sets the location where the database's tables with
table property 'transactional'='true' will be placed.

The change also adds managedlocation to DESCRIBE DATABASE's output.
Example:
DESCRIBE DATABASE db;
+------------------+-----------------------------------------+---------+
| name             | location                                | comment |
+------------------+-----------------------------------------+---------+
| db               | hdfs://localhost:20500/test-warehouse/a |         |
| managedlocation: | hdfs://localhost:20500/test-warehouse/b |         |
+------------------+-----------------------------------------+---------+
DESCRIBE DATABASE EXTENDED db6;
+------------------+-----------------------------------------+---------+
| name             | location                                | comment |
+------------------+-----------------------------------------+---------+
| db               | hdfs://localhost:20500/test-warehouse/a |         |
| managedlocation: | hdfs://localhost:20500/test-warehouse/b |         |
| Owner:           |                                         |         |
|                  | csringhofer                             | USER    |
+------------------+-----------------------------------------+---------+

Note that Impala's output for DESCRIBE DATABASE (EXTENDED) is
different than Hive's, where a new column was added for each extra
piece of information, while Impala adds a new row to keep the 3 column
format. Changing to Hive's format would be preferable in my opinion,
but is a potentially breaking change.
See IMPALA-6686 for further discussion.

Testing:
- added FE and EE tests
- ran relevant tests

Change-Id: I925632a43ff224f762031e89981896722e453399
Reviewed-on: http://gerrit.cloudera.org:8080/16529
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Support Hive metastore managed locations for databases
> ------------------------------------------------------
>
>                 Key: IMPALA-10172
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10172
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Tristan Stevens
>            Assignee: Csaba Ringhofer
>            Priority: Major
>
> In Hive 3 a database can have both managed and (unmanaged) locations.
> Hive DDL syntax is as follows:
> {noformat}
> CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
>   [COMMENT database_comment]
>   [LOCATION hdfs_path]
>   [MANAGEDLOCATION hdfs_path]
>   [WITH DBPROPERTIES (property_name=property_value, ...)];
> ALTER (DATABASE|SCHEMA) database_name SET MANAGEDLOCATION hdfs_path;
> {noformat}
> Right now, Impala does not appear to support this syntax.
> Also the {{DESCRIBE FORMATTED}} and {{DESCRIBE EXTENDED}} statements should display both the {{LOCATION}} and {{MANAGEDLOCATION}}.
> Example:
> {noformat}
> impala-shell -i host-2.user1-c5.my.example.com -d default -k --ssl --ca_cert=/opt/cloudera/security/pki/chain.pem
> Starting Impala Shell using Kerberos authentication
> Using service name 'impala'
> SSL is enabled
> Opened TCP connection to host-2.user1-c5.my.example.com:21000
> Connected to host-2.user1-c5.my.example.com:21000
> Server version: impalad version 3.4.0-SNAPSHOT RELEASE (build 25402784335c39cc24076d71dab7a3ccbd562094)
> Query: use `default`
> ***********************************************************************************
> Welcome to the Impala shell.
> (Impala Shell v3.4.0-SNAPSHOT (2540278) built on Wed Aug  5 11:07:32 UTC 2020)
> To see a summary of a query's progress that updates in real-time, run 'set
> LIVE_PROGRESS=1;'.
> ***********************************************************************************
> Query: use `default`
> [host-2.user1-c5.my.example.com:21000] default> create database dbnew4 LOCATION 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/external' MANAGEDLOCATION 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed';
> Query: create database dbnew4 LOCATION 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/external' MANAGEDLOCATION 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed'
> ERROR: ParseException: Syntax error in line 1:
> ...0/data/dbnewnew/external' MANAGEDLOCATION 'hdfs://ccyc...
>                              ^
> Encountered: IDENTIFIER
> Expected: AS, CACHED, PARTITION, TBLPROPERTIES, UNCACHED
> CAUSED BY: Exception: Syntax error
> [host-2.user1-c5.my.example.com:21000] default> alter database dbnewnew SET MANAGEDLOCATION='hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed';
> Query: alter database dbnewnew SET MANAGEDLOCATION='hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed'
> ERROR: ParseException: Syntax error in line 1:
> ...newnew SET MANAGEDLOCATION='hdfs://host-1.user1-...
>                              ^
> Encountered: =
> Expected: ROLE, IDENTIFIER
> CAUSED BY: Exception: Syntax error
> [host-2.user1-c5.my.example.com:21000] default> describe database formatted db_cust_loc3 ;
> Query: describe database formatted db_cust_loc3
> +--------------+-------------------------------------------------------------------+---------+
> | name         | location                                                          | comment |
> +--------------+-------------------------------------------------------------------+---------+
> | db_cust_loc3 | hdfs://host-1.user1-c5.my.example.com:8020/data/db_cust_loc3 |         |
> | Owner:       |                                                                   |         |
> |              | admin                                                             | USER    |
> +--------------+-------------------------------------------------------------------+---------+
> Fetched 3 row(s) in 0.03s
> [host-2.user1-c5.my.example.com:21000] default> describe database extended db_cust_loc3 ;
> Query: describe database extended db_cust_loc3
> +--------------+-------------------------------------------------------------------+---------+
> | name         | location                                                          | comment |
> +--------------+-------------------------------------------------------------------+---------+
> | db_cust_loc3 | hdfs://host-1.user1-c5.my.example.com:8020/data/db_cust_loc3 |         |
> | Owner:       |                                                                   |         |
> |              | admin                                                             | USER    |
> +--------------+-------------------------------------------------------------------+---------+
> Fetched 3 row(s) in 0.01s
> [host-2.user1-c5.my.example.com:21000] default>
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org