Posted to commits@impala.apache.org by ta...@apache.org on 2019/05/31 16:04:35 UTC
[impala] 02/02: [DOCS] Fix UPDATE/UPSERT/DELETE authorization doc
This is an automated email from the ASF dual-hosted git repository.
tarmstrong pushed a commit to branch 2.x
in repository https://gitbox.apache.org/repos/asf/impala.git
commit 7bb209c173f920808465add475f6e0964eb273f1
Author: Fredy Wijaya <fw...@cloudera.com>
AuthorDate: Tue Jul 17 20:07:32 2018 -0700
[DOCS] Fix UPDATE/UPSERT/DELETE authorization doc
The patch also fixes broken links in the authorization doc.
Change-Id: I9bf6109636e44ca514cfe74fb565f7c506ec0708
Reviewed-on: http://gerrit.cloudera.org:8080/10975
Reviewed-by: Fredy Wijaya <fw...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-on: http://gerrit.cloudera.org:8080/13474
Reviewed-by: Alex Rodoni <ar...@cloudera.com>
---
docs/shared/impala_common.xml | 14 +-
docs/topics/impala_authorization.xml | 1362 ++++++++--------------------------
2 files changed, 319 insertions(+), 1057 deletions(-)
diff --git a/docs/shared/impala_common.xml b/docs/shared/impala_common.xml
index b806690..f96e3d7 100644
--- a/docs/shared/impala_common.xml
+++ b/docs/shared/impala_common.xml
@@ -467,37 +467,37 @@ under the License.
<row>
<entry>UPDATE (Kudu only)</entry>
<entry>ALL</entry>
- <entry>SERVER</entry>
+ <entry>TABLE</entry>
</row>
<row>
<entry>EXPLAIN UPDATE (Kudu only)</entry>
<entry>ALL</entry>
- <entry>SERVER</entry>
+ <entry>TABLE</entry>
</row>
<row>
<entry>UPSERT (Kudu only)</entry>
<entry>ALL</entry>
- <entry>SERVER</entry>
+ <entry>TABLE</entry>
</row>
<row>
<entry>WITH UPSERT (Kudu only)</entry>
<entry>ALL</entry>
- <entry>SERVER</entry>
+ <entry>TABLE</entry>
</row>
<row>
<entry>EXPLAIN UPSERT (Kudu only)</entry>
<entry>ALL</entry>
- <entry>SERVER</entry>
+ <entry>TABLE</entry>
</row>
<row>
<entry>DELETE (Kudu only)</entry>
<entry>ALL</entry>
- <entry>SERVER</entry>
+ <entry>TABLE</entry>
</row>
<row>
<entry>EXPLAIN DELETE (Kudu only)</entry>
<entry>ALL</entry>
- <entry>SERVER</entry>
+ <entry>TABLE</entry>
</row>
</tbody>
</tgroup>
diff --git a/docs/topics/impala_authorization.xml b/docs/topics/impala_authorization.xml
index 300be65..6e63d27 100644
--- a/docs/topics/impala_authorization.xml
+++ b/docs/topics/impala_authorization.xml
@@ -81,24 +81,20 @@ under the License.
Table
Column
</codeblock>
-
- <p rev="2.3.0 collevelauth">
- The object hierarchy for Impala covers Server, URI, Database, Table, and Column. (The Table privileges apply to views as well;
- anywhere you specify a table name, you can specify a view name instead.)
- Column-level authorization is available in <keyword keyref="impala23_full"/> and higher.
- Previously, you constructed views to query specific columns and assigned privilege based on
- the views rather than the base tables. Now, you can use Impala's <xref href="impala_grant.xml"/> and
- <xref href="impala_revoke.xml"/> statements to assign and revoke privileges from specific columns
- in a table.
- </p>
+ <p rev="2.3.0 collevelauth"> The table-level privileges apply to views as
+ well. Anywhere you specify a table name, you can specify a view name
+ instead. </p>
+ <p rev="2.3.0 collevelauth"> In <keyword keyref="impala23_full"/> and
+ higher, you can specify privileges for individual columns. </p>
<p conref="../shared/impala_common.xml#common/sentry_privileges_objects"/>
-
- <p>
- Privileges can be specified for a table or view before that object actually exists. If you do not have
- sufficient privilege to perform an operation, the error message does not disclose if the object exists or
- not.
- </p>
+ <p> Originally, privileges were encoded in a policy file, stored in HDFS.
+ This mode of operation is still an option, but the emphasis of privilege
+ management is moving towards being SQL-based. The mode of operation with
+ <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements instead
+ of the policy file requires that a special Sentry service be enabled;
+ this service stores, retrieves, and manipulates privilege information
+ stored inside the metastore database. </p>
<note>
<p>
@@ -123,16 +119,11 @@ under the License.
</li>
</ul>
</note>
-
- <p>
- Originally, privileges were encoded in a policy file, stored in HDFS. This mode of operation is still an
- option, but the emphasis of privilege management is moving towards being SQL-based. Although currently
- Impala does not have <codeph>GRANT</codeph> or <codeph>REVOKE</codeph> statements, Impala can make use of
- privileges assigned through <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements done through
- Hive. The mode of operation with <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements instead of
- the policy file requires that a special Sentry service be enabled; this service stores, retrieves, and
- manipulates privilege information stored inside the metastore database.
- </p>
+ <p> If you change privileges in Sentry, e.g. adding a user, removing a
+ user, modifying privileges, you must clear the Impala Catalog server
+ cache by running the <codeph>INVALIDATE METADATA</codeph> statement.
+ <codeph>INVALIDATE METADATA</codeph> is not required if you make the
+ changes to privileges within Impala. </p>
</conbody>
</concept>
@@ -152,32 +143,30 @@ under the License.
<codeph>IMPALA_SERVER_ARGS</codeph> declaration in the <filepath>/etc/default/impala</filepath>
configuration file:
</p>
-
<ul>
<li>
- <codeph>-server_name</codeph>: Turns on Sentry authorization for Impala. The
- authorization rules refer to a symbolic server name, and you specify the same name to
- use as the argument to the <codeph>-server_name</codeph> option for all
- <cmdname>impalad</cmdname> nodes in the cluster.
- <p>
- Starting in Impala 1.4.0 and higher, if you specify just
- <codeph>-server_name</codeph> without <codeph>-authorization_policy_file</codeph>,
- Impala uses the Sentry service for authorization.
- </p>
+ <codeph>-server_name</codeph>: Turns on Sentry authorization for
+ Impala. The authorization rules refer to a symbolic server name, and
+ you specify the same name to use as the argument to the
+ <codeph>-server_name</codeph> option for all
+ <cmdname>impalad</cmdname> nodes in the cluster. <p> Starting in
+ Impala 1.4.0 and higher, if you specify just
+ <codeph>-server_name</codeph> without
+ <codeph>-authorization_policy_file</codeph>, Impala uses the
+ Sentry service for authorization. </p>
</li>
-
<li>
<codeph>-sentry_config</codeph>: Specifies the local path to the
- <codeph>sentry-site.xml</codeph> configuration file. This setting is required to
- enable authorization.
- </li>
-
+ <codeph>sentry-site.xml</codeph> configuration file. This setting is
+ required to enable authorization. </li>
<li>
- Specifying the <codeph>-authorization_policy_file</codeph> option in addition to
- <codeph>-server_name</codeph> makes Impala read privilege information from a policy file, rather than
- from the metastore database. The argument to the <codeph>-authorization_policy_file</codeph> option
- specifies the HDFS path to the policy file that defines the privileges on different schema objects.
- </li>
+ <codeph>-authorization_policy_file</codeph>: Specifies the HDFS path
+ to the policy file that defines the privileges on schema objects.
+ Prior to Impala 1.4.0, or if you want to continue storing privilege
+ rules in the policy file, specify the
+ <codeph>-authorization_policy_file</codeph> option to make Impala
+ read privilege information from a policy file, rather than from the
+ metastore database. </li>
</ul>
<p rev="1.4.0">
@@ -207,1065 +196,338 @@ under the License.
configuration file for Hive, as well as in the <codeph>-server_name</codeph> option for
<cmdname>impalad</cmdname>.
</p>
-
- <p>
- The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to the current instance
- of Impala. This symbolic name is used in the following ways:
- </p>
-
- <ul>
- <li>
- <p>
- Specify the <codeph>server1</codeph> value for the <codeph>sentry.hive.server</codeph> property in the
- <filepath>sentry-site.xml</filepath> configuration file for Hive, as well as in the
- <codeph>-server_name</codeph> option for <cmdname>impalad</cmdname>.
- </p>
- <p>
- If the <cmdname>impalad</cmdname> daemon is not already running, start it as described in
- <xref href="impala_processes.xml#processes"/>. If it is already running, restart it with the command
- <codeph>sudo /etc/init.d/impala-server restart</codeph>. Run the appropriate commands on all the nodes
- where <cmdname>impalad</cmdname> normally runs.
- </p>
- </li>
-
- <li>
- <p>
- If you use the mode of operation using the policy file, the rules in the <codeph>[roles]</codeph>
- section of the policy file refer to this same <codeph>server1</codeph> name. For example, the following
- rule sets up a role <codeph>report_generator</codeph> that lets users with that role query any table in
- a database named <codeph>reporting_db</codeph> on a node where the <cmdname>impalad</cmdname> daemon
- was started up with the <codeph>-server_name=server1</codeph> option:
- </p>
-<codeblock>[roles]
-report_generator = server=server1->db=reporting_db->table=*->action=SELECT
-</codeblock>
- </li>
- </ul>
-
- <p>
- When <cmdname>impalad</cmdname> is started with one or both of the <codeph>-server_name=server1</codeph>
- and <codeph>-authorization_policy_file</codeph> options, Impala authorization is enabled. If Impala detects
- any errors or inconsistencies in the authorization settings or the policy file, the daemon refuses to
- start.
- </p>
+ <p> Now restart the <cmdname>impalad</cmdname> daemons on all the nodes. </p>
</conbody>
</concept>
<concept id="sentry_service">
- <title>Using Impala with the Sentry Service (<keyword keyref="impala14"/> or higher only)</title>
-
- <conbody>
-
- <p>
- When you use the Sentry service rather than the policy file, you set up privileges through
- <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statement in either Impala or Hive, then both components
- use those same privileges automatically. (Impala added the <codeph>GRANT</codeph> and
- <codeph>REVOKE</codeph> statements in <keyword keyref="impala20_full"/>.)
- </p>
-
- </conbody>
- </concept>
-
- <concept id="security_policy_file">
-
- <title>Using Impala with the Sentry Policy File</title>
-
+ <title>Using Impala with the Sentry Service</title>
<conbody>
-
- <p>
- The policy file is a file that you put in a designated location in HDFS, and is read during the startup of
- the <cmdname>impalad</cmdname> daemon when you specify both the <codeph>-server_name</codeph> and
- <codeph>-authorization_policy_file</codeph> startup options. It controls which objects (databases, tables,
- and HDFS directory paths) can be accessed by the user who connects to <cmdname>impalad</cmdname>, and what
- operations that user can perform on the objects.
- </p>
-
- <note rev="1.4.0">
- <p rev="1.4.0">
- The Sentry service, as described in <xref href="impala_authorization.xml#sentry_service"/>, stores
- authorization metadata in a relational database. This means you can manage user privileges for Impala tables
- using traditional <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> SQL statements, rather than the
- policy file approach described here.If you are still using policy files, migrate to the
- database-backed service whenever practical.
- </p>
- </note>
-
- <p>
- The location of the policy file is listed in the <filepath>auth-site.xml</filepath> configuration file. To
- minimize overhead, the security information from this file is cached by each <cmdname>impalad</cmdname>
- daemon and refreshed automatically, with a default interval of 5 minutes. After making a substantial change
- to security policies, restart all Impala daemons to pick up the changes immediately.
- </p>
-
- <p>
- URIs represent the file paths you specify as part of statements such as <codeph>CREATE
- EXTERNAL TABLE</codeph> and <codeph>LOAD DATA</codeph>. Typically, you specify what look
- like UNIX paths, but these locations can also be prefixed with <codeph>hdfs://</codeph>
- to make clear that they are really URIs. To set privileges for a URI, specify the name
- of a directory, and the privilege applies to all the files in that directory and any
- directories underneath it.
- </p>
-
- <p>
- URIs must start with <codeph>hdfs://</codeph>, <codeph>s3a://</codeph>,
- <codeph>adl://</codeph>, or <codeph>file://</codeph>. If a URI starts with an absolute
- path, the path will be appended to the default filesystem prefix. For example, if you
- specify:
-<codeblock>
+ <p> When you use the Sentry service, you set up privileges through the
+ <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in
+ either Impala or Hive. Then both components use those same privileges
+ automatically. (Impala added the <codeph>GRANT</codeph> and
+ <codeph>REVOKE</codeph> statements in <keyword keyref="impala20_full"
+ />.) </p>
+ <p> For information about using the Impala <codeph>GRANT</codeph> and
+ <codeph>REVOKE</codeph> statements, see <xref
+ href="impala_grant.xml#grant"/> and <xref
+ href="impala_revoke.xml#revoke"/>. </p>
+ <p> URIs represent the file paths you specify as part of statements such
+ as <codeph>CREATE EXTERNAL TABLE</codeph> and <codeph>LOAD
+ DATA</codeph>. Typically, you specify what look like UNIX paths, but
+ these locations can also be prefixed with <codeph>hdfs://</codeph> to
+ make clear that they are really URIs. To set privileges for a URI,
+ specify the name of a directory, and the privilege applies to all the
+ files in that directory and any directories underneath it. </p>
+ <p> URIs must start with <codeph>hdfs://</codeph>,
+ <codeph>s3a://</codeph>, <codeph>adl://</codeph>, or
+ <codeph>file://</codeph>. If a URI starts with an absolute path, the
+ path will be appended to the default filesystem prefix. For example, if
+ you specify: <codeblock>
GRANT ALL ON URI '/tmp';
-</codeblock>
- The above statement effectively becomes the following where the default filesystem is
- HDFS.
-<codeblock>
+</codeblock> The above
+ statement effectively becomes the following where the default filesystem
+ is HDFS.
+ <codeblock>
GRANT ALL ON URI 'hdfs://localhost:20500/tmp';
</codeblock>
</p>
-
- <p>
- When defining URIs for HDFS, you must also specify the NameNode. For example:
-<codeblock>GRANT ALL ON URI file:///path/to/dir TO <role>
+ <p> When defining URIs for HDFS, you must also specify the NameNode. For
+ example: <codeblock>GRANT ALL ON URI file:///path/to/dir TO <role>
GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO <role></codeblock>
<note type="warning">
- <p>
- Because the NameNode host and port must be specified, it is strongly recommended
- that you use High Availability (HA). This ensures that the URI will remain constant
- even if the NameNode changes. For example:
- </p>
-<codeblock>GRANT ALL ON URI hdfs://ha-nn-uri/path/to/dir TO <role></codeblock>
+ <p> Because the NameNode host and port must be specified, it is
+ strongly recommended that you use High Availability (HA). This
+ ensures that the URI will remain constant even if the NameNode
+ changes. For example: </p>
+ <codeblock>GRANT ALL ON URI hdfs://ha-nn-uri/path/to/dir TO <role></codeblock>
</note>
</p>
-
</conbody>
-
- <concept id="security_policy_file_details">
-
- <title>Policy File Location and Format</title>
-
- <conbody>
-
- <p>
- The policy file uses the familiar <codeph>.ini</codeph> format, divided into the major sections
- <codeph>[groups]</codeph> and <codeph>[roles]</codeph>. There is also an optional
- <codeph>[databases]</codeph> section, which allows you to specify a specific policy file for a particular
- database, as explained in <xref href="#security_multiple_policy_files"/>. Another optional section,
- <codeph>[users]</codeph>, allows you to override the OS-level mapping of users to groups; that is an
- advanced technique primarily for testing and debugging, and is beyond the scope of this document.
- </p>
-
- <p>
- In the <codeph>[groups]</codeph> section, you define various categories of users and select which roles
- are associated with each category. The group and usernames correspond to Linux groups and users on the
- server where the <cmdname>impalad</cmdname> daemon runs.
- </p>
-
- <p>
- The group and usernames in the <codeph>[groups]</codeph> section correspond to Linux groups and users on
- the server where the <cmdname>impalad</cmdname> daemon runs. When you access Impala through the
- <cmdname>impalad</cmdname> interpreter, for purposes of authorization, the user is the logged-in Linux
- user and the groups are the Linux groups that user is a member of. When you access Impala through the
- ODBC or JDBC interfaces, the user and password specified through the connection string are used as login
- credentials for the Linux server, and authorization is based on that username and the associated Linux
- group membership.
- </p>
-
- <p>
- In the <codeph>[roles]</codeph> section, you a set of roles. For each role, you specify precisely the set
- of privileges is available. That is, which objects users with that role can access, and what operations
- they can perform on those objects. This is the lowest-level category of security information; the other
- sections in the policy file map the privileges to higher-level divisions of groups and users. In the
- <codeph>[groups]</codeph> section, you specify which roles are associated with which groups. The group
- and usernames correspond to Linux groups and users on the server where the <cmdname>impalad</cmdname>
- daemon runs. The privileges are specified using patterns like:
-<codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT
-server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=CREATE
-server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=ALL
-</codeblock>
- For the <varname>server_name</varname> value, substitute the same symbolic name you specify with the
- <cmdname>impalad</cmdname> <codeph>-server_name</codeph> option. You can use <codeph>*</codeph> wildcard
- characters at each level of the privilege specification to allow access to all such objects. For example:
-<codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT
-server=impala-host.example.com->db=*->table=*->action=CREATE
-server=impala-host.example.com->db=*->table=audit_log->action=SELECT
-server=impala-host.example.com->db=default->table=t1->action=*
-</codeblock>
- </p>
-
- <p>
- When authorization is enabled, Impala uses the policy file as a <i>whitelist</i>, representing every
- privilege available to any user on any object. That is, only operations specified for the appropriate
- combination of object, role, group, and user are allowed; all other operations are not allowed. If a
- group or role is defined multiple times in the policy file, the last definition takes precedence.
- </p>
-
- <p>
- To understand the notion of whitelisting, set up a minimal policy file that does not provide any
- privileges for any object. When you connect to an Impala node where this policy file is in effect, you
- get no results for <codeph>SHOW DATABASES</codeph>, and an error when you issue any <codeph>SHOW
- TABLES</codeph>, <codeph>USE <varname>database_name</varname></codeph>, <codeph>DESCRIBE
- <varname>table_name</varname></codeph>, <codeph>SELECT</codeph>, and or other statements that expect to
- access databases or tables, even if the corresponding databases and tables exist.
- </p>
-
- <p>
- The contents of the policy file are cached, to avoid a performance penalty for each query. The policy
- file is re-checked by each <cmdname>impalad</cmdname> node every 5 minutes. When you make a
- non-time-sensitive change such as adding new privileges or new users, you can let the change take effect
- automatically a few minutes later. If you remove or reduce privileges, and want the change to take effect
- immediately, restart the <cmdname>impalad</cmdname> daemon on all nodes, again specifying the
- <codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> options so that the rules
- from the updated policy file are applied.
- </p>
- </conbody>
- </concept>
-
- <concept id="security_examples">
-
- <title>Examples of Policy File Rules for Security Scenarios</title>
-
- <conbody>
-
- <p>
- The following examples show rules that might go in the policy file to deal with various
- authorization-related scenarios. For illustration purposes, this section shows several very small policy
- files with only a few rules each. In your environment, typically you would define many roles to cover all
- the scenarios involving your own databases, tables, and applications, and a smaller number of groups,
- whose members are given the privileges from one or more roles.
- </p>
-
- <example id="sec_ex_unprivileged">
-
- <title>A User with No Privileges</title>
-
- <p>
- If a user has no privileges at all, that user cannot access any schema objects in the system. The error
- messages do not disclose the names or existence of objects that the user is not authorized to read.
- </p>
-
- <p>
-<!-- This example demonstrates the lack of privileges using a blank policy file, so no users have any privileges. -->
- This is the experience you want a user to have if they somehow log into a system where they are not an
- authorized Impala user. In a real deployment with a filled-in policy file, a user might have no
- privileges because they are not a member of any of the relevant groups mentioned in the policy file.
- </p>
-
-<!-- Have the raw material but not formatted into easily digestible example. Do for first 1.1 doc refresh.
-<codeblock></codeblock> -->
-
- </example>
-
- <example id="sec_ex_superuser">
-
- <title>Examples of Privileges for Administrative Users</title>
-
- <p>
- When an administrative user has broad access to tables or databases, the associated rules in the
- <codeph>[roles]</codeph> section typically use wildcards and/or inheritance. For example, in the
- following sample policy file, <codeph>db=*</codeph> refers to all databases and
- <codeph>db=*->table=*</codeph> refers to all tables in all databases.
- </p>
-
- <p>
- Omitting the rightmost portion of a rule means that the privileges apply to all the objects that could
- be specified there. For example, in the following sample policy file, the
- <codeph>all_databases</codeph> role has all privileges for all tables in all databases, while the
- <codeph>one_database</codeph> role has all privileges for all tables in one specific database. The
- <codeph>all_databases</codeph> role does not grant privileges on URIs, so a group with that role could
- not issue a <codeph>CREATE TABLE</codeph> statement with a <codeph>LOCATION</codeph> clause. The
- <codeph>entire_server</codeph> role has all privileges on both databases and URIs within the server.
- </p>
-
-<codeblock>[groups]
-supergroup = all_databases
-
-[roles]
-read_all_tables = server=server1->db=*->table=*->action=SELECT
-all_tables = server=server1->db=*->table=*
-all_databases = server=server1->db=*
-one_database = server=server1->db=test_db
-entire_server = server=server1
+ </concept>
+ <concept id="concept_k45_lbm_f2b">
+ <title>Examples of Setting up Authorization for Security Scenarios</title>
+ <conbody>
+ <p> The following examples show how to set up authorization to deal with
+ various scenarios. </p>
+ <example>
+ <title>A User with No Privileges</title>
+ <p> If a user has no privileges at all, that user cannot access any
+ schema objects in the system. The error messages do not disclose the
+ names or existence of objects that the user is not authorized to read. </p>
+ <p> This is the experience you want a user to have if they somehow log
+ into a system where they are not an authorized Impala user. Or in a
+ real deployment, a user might have no privileges because they are not
+ a member of any of the authorized groups. </p>
+ </example>
+ <example>
+ <title>Examples of Privileges for Administrative Users</title>
+ <p> In this example, the SQL statements grant the
+ <codeph>entire_server</codeph> role all privileges on both the
+ databases and URIs within the server. </p>
+ <codeblock>CREATE ROLE entire_server;
+GRANT ROLE entire_server TO GROUP admin_group;
+GRANT ALL ON SERVER server1 TO ROLE entire_server;
</codeblock>
+ </example>
+ <example>
+ <title>A User with Privileges for Specific Databases and Tables</title>
+ <p> If a user has privileges for specific tables in specific databases,
+ the user can access those things but nothing else. They can see the
+ tables and their parent databases in the output of <codeph>SHOW
+ TABLES</codeph> and <codeph>SHOW DATABASES</codeph>,
+ <codeph>USE</codeph> the appropriate databases, and perform the
+ relevant actions (<codeph>SELECT</codeph> and/or
+ <codeph>INSERT</codeph>) based on the table privileges. To actually
+ create a table requires the <codeph>ALL</codeph> privilege at the
+ database level, so you might define separate roles for the user that
+ sets up a schema and other users or applications that perform
+ day-to-day operations on the tables. </p>
+ <codeblock>
+CREATE ROLE one_database;
+GRANT ROLE one_database TO GROUP admin_group;
+GRANT ALL ON DATABASE db1 TO ROLE one_database;
+
+CREATE ROLE instructor;
+GRANT ROLE instructor TO GROUP trainers;
+GRANT ALL ON TABLE db1.lesson TO ROLE instructor;
- </example>
-
- <example id="sec_ex_detailed">
-
- <title>A User with Privileges for Specific Databases and Tables</title>
-
- <p>
- If a user has privileges for specific tables in specific databases, the user can access those things
- but nothing else. They can see the tables and their parent databases in the output of <codeph>SHOW
- TABLES</codeph> and <codeph>SHOW DATABASES</codeph>, <codeph>USE</codeph> the appropriate databases,
- and perform the relevant actions (<codeph>SELECT</codeph> and/or <codeph>INSERT</codeph>) based on the
- table privileges. To actually create a table requires the <codeph>ALL</codeph> privilege at the
- database level, so you might define separate roles for the user that sets up a schema and other users
- or applications that perform day-to-day operations on the tables.
- </p>
-
- <p>
- The following sample policy file shows some of the syntax that is appropriate as the policy file grows,
- such as the <codeph>#</codeph> comment syntax, <codeph>\</codeph> continuation syntax, and comma
- separation for roles assigned to groups or privileges assigned to roles.
- </p>
-
-<codeblock>[groups]
-employee = training_sysadmin, instructor
-visitor = student
-
-[roles]
-training_sysadmin = server=server1->db=training, \
-server=server1->db=instructor_private, \
-server=server1->db=lesson_development
-instructor = server=server1->db=training->table=*->action=*, \
-server=server1->db=instructor_private->table=*->action=*, \
-server=server1->db=lesson_development->table=lesson*
# This particular course is all about queries, so the students can SELECT but not INSERT or CREATE/DROP.
-student = server=server1->db=training->table=lesson_*->action=SELECT
-</codeblock>
-
- </example>
-
-<!--
-<example id="sec_ex_superuser_single_db">
-<title>A User with Full Privileges for a Specific Database</title>
-<p>
-
-</p>
-<codeblock></codeblock>
-</example>
-
-<example id="sec_ex_readonly_single_db">
-<title>A User with Read-Only Privileges for a Specific Database</title>
-<p>
-
-</p>
-<codeblock></codeblock>
- <p>
- If a user has <codeph>SELECT</codeph> privilege for a database, they can issue a <codeph>USE</codeph> statement
- for that database. Whether or not they can access tables within the database depends on further privileges
- defined at the table level.
- </p>
-
-<codeblock></codeblock>
-
- <li>
- The <codeph>staging_dir</codeph> role can specify the HDFS path
- <filepath>/user/impala-user/external_data</filepath> with the <codeph>LOAD
- DATA</codeph> statement. When Impala queries or loads data files, it operates on
- all the files in that directory, not just a single file, so any Impala
- <codeph>LOCATION</codeph> parameters refer to a directory rather than an
- individual file.
- </li>
- </ul>
-
-<codeblock></codeblock>
-</example>
-
-<example id="sec_ex_load_data">
-<title>A User with Privileges to Load Data but not Read Data</title>
-
- <p>
- If a user has <codeph>INSERT</codeph> privilege for a table, they can write to the table if it already exists.
- They cannot create or alter the table; those operations require the <codeph>ALL</codeph> privilege.
- </p>
-<codeblock></codeblock>
-</example>
--->
-
- <example id="sec_ex_external_files">
-
- <title>Privileges for Working with External Data Files</title>
-
- <p>
- When data is being inserted through the <codeph>LOAD DATA</codeph> statement, or is referenced from an
- HDFS location outside the normal Impala database directories, the user also needs appropriate
- permissions on the URIs corresponding to those HDFS locations.
- </p>
-
- <p>
- In this sample policy file:
- </p>
-
- <ul>
- <li>
- The <codeph>external_table</codeph> role lets us insert into and query the Impala table,
- <codeph>external_table.sample</codeph>.
- </li>
-
- <li>
- Members of the <codeph>impala_users</codeph> group have the
- <codeph>instructor</codeph> role and so can create, insert into, and query any
- tables in the <codeph>training</codeph> database, but cannot create or drop the
- database itself.
- </li>
-
- <li>
- The <codeph>username</codeph> under the <codeph>[groups]</codeph> section refers to the
- <codeph>username</codeph> group. (In this example, there is a <codeph>username</codeph> user
- that is a member of a <codeph>username</codeph> group.)
- </li>
- </ul>
-
- <p>
- Policy file:
- </p>
-
-<codeblock>[groups]
-username = external_table, staging_dir
-
-[roles]
-external_table_admin = server=server1->db=external_table
-external_table = server=server1->db=external_table->table=sample->action=*
-staging_dir = server=server1->uri=hdfs://127.0.0.1:8020/user/username/external_data->action=*
-</codeblock>
-
- <p>
- <cmdname>impala-shell</cmdname> session:
- </p>
-
-<codeblock>[localhost:21000] > use external_table;
-Query: use external_table
-[localhost:21000] > show tables;
-Query: show tables
-Query finished, fetching results ...
-+--------+
-| name |
-+--------+
-| sample |
-+--------+
-Returned 1 row(s) in 0.02s
-
-[localhost:21000] > select * from sample;
-Query: select * from sample
-Query finished, fetching results ...
-+-----+
-| x |
-+-----+
-| 1 |
-| 5 |
-| 150 |
-+-----+
-Returned 3 row(s) in 1.04s
-
-[localhost:21000] > load data inpath '/user/username/external_data' into table sample;
-Query: load data inpath '/user/username/external_data' into table sample
-Query finished, fetching results ...
-+----------------------------------------------------------+
-| summary |
-+----------------------------------------------------------+
-| Loaded 1 file(s). Total files in destination location: 2 |
-+----------------------------------------------------------+
-Returned 1 row(s) in 0.26s
-[localhost:21000] > select * from sample;
-Query: select * from sample
-Query finished, fetching results ...
-+-------+
-| x |
-+-------+
-| 2 |
-| 4 |
-| 6 |
-| 8 |
-| 64738 |
-| 49152 |
-| 1 |
-| 5 |
-| 150 |
-+-------+
-Returned 9 row(s) in 0.22s
-
-[localhost:21000] > load data inpath '/user/username/unauthorized_data' into table sample;
-Query: load data inpath '/user/username/unauthorized_data' into table sample
-ERROR: AuthorizationException: User 'username' does not have privileges to access: hdfs://127.0.0.1:8020/user/username/unauthorized_data
-</codeblock>
-
- </example>
-
- <example audience="hidden" id="sec_ex_views" rev="2.3.0 collevelauth">
-
- <title>Controlling Access at the Column Level through Views</title>
-
- <p>
- If a user has <codeph>SELECT</codeph> privilege for a view, they can query the view, even if they do
- not have any privileges on the underlying table. To see the details about the underlying table through
- <codeph>EXPLAIN</codeph> or <codeph>DESCRIBE FORMATTED</codeph> statements on the view, the user must
- also have <codeph>SELECT</codeph> privilege for the underlying table.
- </p>
-
- <note type="important">
- <p>
- The types of data that are considered sensitive and confidential differ depending on the jurisdiction,
- the type of industry, or both. For fine-grained access controls, set up appropriate privileges based
- on all applicable laws and regulations.
- </p>
- <p>
- Be careful using the <codeph>ALTER VIEW</codeph> statement to point an existing view at a different
- base table or a new set of columns that includes sensitive or restricted data. Make sure that any
- users who have <codeph>SELECT</codeph> privilege on the view do not gain access to any additional
- information they are not authorized to see.
- </p>
- </note>
-
- <p>
- The following example shows how a system administrator could set up a table containing some columns
- with sensitive information, then create a view that only exposes the non-confidential columns.
- </p>
-
- <note rev="1.4.0">
- In <ph rev="upstream">CDH 5</ph> and higher, <ph
- rev="upstream">Cloudera</ph>
- recommends managing privileges through SQL statements, as described in
- <xref
- href="impala_authorization.xml#sentry_service"/>. If you are still using
- policy files, plan to migrate to the new approach some time in the future.
- </note>
-
- <p>
- Then the following policy file specifies read-only privilege for that view, without authorizing access
- to the underlying table:
- </p>
-
-<codeblock>[groups]
-employee = view_only_privs
-
-[roles]
-view_only_privs = server=server1->db=reports->table=name_address_view->action=SELECT
-</codeblock>
-
- <p>
- Thus, a user with the <codeph>view_only_privs</codeph> role could access through Impala queries the
- basic information but not the sensitive information, even if both kinds of information were part of the
- same data file.
- </p>
-
- <p>
- You might define other views to allow users from different groups to query different sets of columns.
- </p>
-
- </example>
-
- <example id="sec_sysadmin">
-
- <title>Separating Administrator Responsibility from Read and Write Privileges</title>
-
- <p>
- Remember that creating a database requires full privilege on that database, while day-to-day
- operations on tables within that database can be performed with lower levels of privilege on specific
- tables. Thus, you might set up separate roles for each database or application: an administrative one
- that could create or drop the database, and a user-level one that can access only the relevant tables.
- </p>
-
- <p>
- For example, this policy file divides responsibilities between users in 3 different groups:
- </p>
-
- <ul>
- <li>
- Members of the <codeph>supergroup</codeph> group have the <codeph>training_sysadmin</codeph> role and
- so can set up a database named <codeph>training</codeph>.
- </li>
-
- <li> Members of the <codeph>employee</codeph> group have the
- <codeph>instructor</codeph> role and so can create, insert into,
- and query any tables in the <codeph>training</codeph> database,
- but cannot create or drop the database itself. </li>
-
- <li>
- Members of the <codeph>visitor</codeph> group have the <codeph>student</codeph> role and so can query
- those tables in the <codeph>training</codeph> database.
- </li>
- </ul>
-
- <p>
- In the <codeph>[roles]</codeph> section, you define a set of roles. For each role, you
- specify precisely the set of privileges that is available. That is, which objects users
- with that role can access, and what operations they can perform on those objects. This
- is the lowest-level category of security information; the other sections in the policy
- file map the privileges to higher-level divisions of groups and users. In the
- <codeph>[groups]</codeph> section, you specify which roles are associated with which
- groups. The group and usernames correspond to Linux groups and users on the server
- where the <cmdname>impalad</cmdname> daemon runs. The privileges are specified using
- patterns like:
-<codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT
+CREATE ROLE student;
+GRANT ROLE student TO GROUP visitors;
+GRANT SELECT ON TABLE db1.training TO ROLE student;</codeblock>
+ </example>
+ <example>
+ <title>Privileges for Working with External Data Files</title>
+ <p> When data is being inserted through the <codeph>LOAD DATA</codeph>
+ statement, or is referenced from an HDFS location outside the normal
+ Impala database directories, the user also needs appropriate
+ permissions on the URIs corresponding to those HDFS locations. </p>
+ <p> In this example: </p>
+ <ul>
+ <li> The <codeph>external_table</codeph> role can insert into and
+ query the Impala table, <codeph>external_table.sample</codeph>. </li>
+ <li> The <codeph>staging_dir</codeph> role can specify the HDFS path
+ <filepath>/user/impala-user/external_data</filepath> with the
+ <codeph>LOAD DATA</codeph> statement. When Impala queries or loads
+ data files, it operates on all the files in that directory, not just
+ a single file, so any Impala <codeph>LOCATION</codeph> parameters
+ refer to a directory rather than an individual file. </li>
+ </ul>
+ <codeblock>CREATE ROLE external_table;
+GRANT ROLE external_table TO GROUP impala_users;
+GRANT ALL ON TABLE external_table.sample TO ROLE external_table;
+
+CREATE ROLE staging_dir;
+GRANT ROLE staging_dir TO GROUP impala_users;
+GRANT ALL ON URI 'hdfs://127.0.0.1:8020/user/impala-user/external_data' TO ROLE staging_dir;</codeblock>
+ </example>
+ <example>
+ <title>Separating Administrator Responsibility from Read and Write
+ Privileges</title>
+ <p> To create a database, you need full privilege on that database,
+ while day-to-day operations on tables within that database can be
+ performed with lower levels of privilege on specific tables. Thus, you
+ might set up separate roles for each database or application: an
+ administrative one that could create or drop the database, and a
+ user-level one that can access only the relevant tables. </p>
+ <p> In this example, the responsibilities are divided between users in 3
+ different groups: </p>
+ <ul>
+ <li> Members of the <codeph>supergroup</codeph> group have the
+ <codeph>training_sysadmin</codeph> role and so can set up a
+ database named <codeph>training</codeph>. </li>
+ <li> Members of the <codeph>impala_users</codeph> group have the
+ <codeph>instructor</codeph> role and so can create, insert into,
+ and query any tables in the <codeph>training</codeph> database, but
+ cannot create or drop the database itself. </li>
+ <li> Members of the <codeph>visitor</codeph> group have the
+ <codeph>student</codeph> role and so can query those tables in the
+ <codeph>training</codeph> database. </li>
+ </ul>
+ <codeblock>CREATE ROLE training_sysadmin;
+GRANT ROLE training_sysadmin TO GROUP supergroup;
+GRANT ALL ON DATABASE training1 TO ROLE training_sysadmin;
+
+CREATE ROLE instructor;
+GRANT ROLE instructor TO GROUP impala_users;
+GRANT ALL ON TABLE training1.course1 TO ROLE instructor;
+
+CREATE ROLE student;
+GRANT ROLE student TO GROUP visitor;
+GRANT SELECT ON TABLE training1.course1 TO ROLE student;</codeblock>
+ </example>
+ </conbody>
+ </concept>
+ <concept id="security_policy_file">
+ <title>Using Impala with the Sentry Policy File</title>
+ <conbody>
+ <p> The policy file is a file that you put in a designated location in
+ HDFS, and is read during the startup of the <cmdname>impalad</cmdname>
+ daemon when you specify both the <codeph>-server_name</codeph> and
+ <codeph>-authorization_policy_file</codeph> startup options. It
+ controls which objects (databases, tables, and HDFS directory paths) can
+ be accessed by the user who connects to <cmdname>impalad</cmdname>, and
+ what operations that user can perform on the objects. </p>
+ <note rev="1.4.0"> In <ph rev="upstream">CDH 5</ph> and higher, <ph
+ rev="upstream">Cloudera</ph> recommends managing privileges through
+ SQL statements, as described in <xref
+ href="impala_authorization.xml#sentry_service"/>. If you are still
+ using policy files, plan to migrate to the new approach some time in the
+ future. </note>
+ <p> The location of the policy file is listed in the
+ <filepath>auth-site.xml</filepath> configuration file. </p>
+ <p> When authorization is enabled, Impala uses the policy file as a
+ <i>whitelist</i>, representing every privilege available to any user
+ on any object. That is, only operations specified for the appropriate
+ combination of object, role, group, and user are allowed. All other
+ operations are not allowed. If a group or role is defined multiple times
+ in the policy file, the last definition takes precedence. </p>
+ <p> To understand the notion of whitelisting, set up a minimal policy file
+ that does not provide any privileges for any object. When you connect to
+ an Impala node where this policy file is in effect, you get no results
+ for <codeph>SHOW DATABASES</codeph>, and an error when you issue any
+ <codeph>SHOW TABLES</codeph>, <codeph>USE
+ <varname>database_name</varname></codeph>, <codeph>DESCRIBE
+ <varname>table_name</varname></codeph>, <codeph>SELECT</codeph>, or
+ other statement that expects to access databases or tables, even if
+ the corresponding databases and tables exist. </p>
+ <p> The contents of the policy file are cached, to avoid a performance
+ penalty for each query. The policy file is re-checked by each
+ <cmdname>impalad</cmdname> node every 5 minutes. When you make a
+ non-time-sensitive change such as adding new privileges or new users,
+ you can let the change take effect automatically a few minutes later. If
+ you remove or reduce privileges, and want the change to take effect
+ immediately, restart the <cmdname>impalad</cmdname> daemon on all nodes,
+ again specifying the <codeph>-server_name</codeph> and
+ <codeph>-authorization_policy_file</codeph> options so that the rules
+ from the updated policy file are applied. </p>
+ </conbody>
+ <concept id="security_policy_file_details">
+ <title>Policy File Format</title>
+ <conbody>
+ <p> The policy file uses the familiar <codeph>.ini</codeph> format,
+ divided into the major sections <codeph>[groups]</codeph> and
+ <codeph>[roles]</codeph>. </p>
+ <p> There is also an optional <codeph>[databases]</codeph> section,
+ which allows you to specify a specific policy file for a particular
+ database, as explained in <xref href="#security_multiple_policy_files"
+ />. </p>
+ <p> Another optional section, <codeph>[users]</codeph>, allows you to
+ override the OS-level mapping of users to groups; that is an advanced
+ technique primarily for testing and debugging, and is beyond the scope
+ of this document. </p>
+ <p> In the <codeph>[groups]</codeph> section, you define various
+ categories of users and select which roles are associated with each
+ category. The group and usernames correspond to Linux groups and users
+ on the server where the <cmdname>impalad</cmdname> daemon runs. </p>
+ <p> The group and usernames in the <codeph>[groups]</codeph> section
+ correspond to Hadoop groups and users on the server where the
+ <cmdname>impalad</cmdname> daemon runs. When you access Impala
+ through the <cmdname>impala-shell</cmdname> interpreter, for purposes of
+ authorization, the user is the logged-in Linux user and the groups are
+ the Linux groups that user is a member of. When you access Impala
+ through the ODBC or JDBC interfaces, the user and password specified
+ through the connection string are used as login credentials for the
+ Linux server, and authorization is based on that username and the
+ associated Linux group membership. </p>
+ <p> In the <codeph>[roles]</codeph> section, you define a set of roles. For
+ each role, you specify precisely the set of privileges that is available:
+ which objects users with that role can access, and what
+ operations they can perform on those objects. This is the lowest-level
+ category of security information; the other sections in the policy
+ file map the privileges to higher-level divisions of groups and users.
+ In the <codeph>[groups]</codeph> section, you specify which roles are
+ associated with which groups. The group and usernames correspond to
+ Linux groups and users on the server where the
+ <cmdname>impalad</cmdname> daemon runs. The privileges are specified
+ using patterns like:
+ <codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=CREATE
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=ALL
</codeblock>
- For the <varname>server_name</varname> value, substitute the same symbolic name you
- specify with the <cmdname>impalad</cmdname> <codeph>-server_name</codeph> option. You
- can use <codeph>*</codeph> wildcard characters at each level of the privilege
- specification to allow access to all such objects. For example:
-<codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT
+ For the <varname>server_name</varname> value, substitute the same
+ symbolic name you specify with the <cmdname>impalad</cmdname>
+ <codeph>-server_name</codeph> option. You can use <codeph>*</codeph>
+ wildcard characters at each level of the privilege specification to
+ allow access to all such objects. For example:
+ <codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT
server=impala-host.example.com->db=*->table=*->action=CREATE
server=impala-host.example.com->db=*->table=audit_log->action=SELECT
server=impala-host.example.com->db=default->table=t1->action=*
</codeblock>
</p>
-
- </example>
</conbody>
</concept>
-
<concept id="security_multiple_policy_files">
-
<title>Using Multiple Policy Files for Different Databases</title>
-
<conbody>
-
- <p>
- For an Impala cluster with many databases being accessed by many users and applications, it might be
- cumbersome to update the security policy file for each privilege change or each new database, table, or
- view. You can allow security to be managed separately for individual databases, by setting up a separate
- policy file for each database:
- </p>
-
+ <p> For an Impala cluster with many databases being accessed by many
+ users and applications, it might be cumbersome to update the security
+ policy file for each privilege change or each new database, table, or
+ view. You can allow security to be managed separately for individual
+ databases, by setting up a separate policy file for each database: </p>
<ul>
- <li>
- Add the optional <codeph>[databases]</codeph> section to the main policy file.
- </li>
-
- <li>
- Add entries in the <codeph>[databases]</codeph> section for each database that has its own policy file.
- </li>
-
- <li>
- For each listed database, specify the HDFS path of the appropriate policy file.
- </li>
+ <li> Add the optional <codeph>[databases]</codeph> section to the main
+ policy file. </li>
+ <li> Add entries in the <codeph>[databases]</codeph> section for each
+ database that has its own policy file. </li>
+ <li> For each listed database, specify the HDFS path of the
+ appropriate policy file. </li>
</ul>
-
- <p>
- For example:
- </p>
-
-<codeblock>[databases]
+ <p> For example: </p>
+ <codeblock>[databases]
# Defines the location of the per-DB policy files for the 'customers' and 'sales' databases.
customers = hdfs://ha-nn-uri/etc/access/customers.ini
sales = hdfs://ha-nn-uri/etc/access/sales.ini
</codeblock>
-
- <p>
- To enable URIs in per-DB policy files, the Java configuration option <codeph>sentry.allow.uri.db.policyfile</codeph>
- must be set to <codeph>true</codeph>. For example:
- </p>
-
-<codeblock>JAVA_TOOL_OPTIONS="-Dsentry.allow.uri.db.policyfile=true"
+ <p> To enable URIs in per-DB policy files, the Java configuration option
+ <codeph>sentry.allow.uri.db.policyfile</codeph> must be set to
+ <codeph>true</codeph>. For example: </p>
+ <codeblock>JAVA_TOOL_OPTIONS="-Dsentry.allow.uri.db.policyfile=true"
</codeblock>
-
- <note type="important">
- Enabling URIs in per-DB policy files introduces a security risk by allowing the owner of the db-level
- policy file to grant himself/herself load privileges to anything the <codeph>impala</codeph> user has
- read permissions for in HDFS (including data in other databases controlled by different db-level policy
- files).
- </note>
+ <note type="important"> Enabling URIs in per-DB policy files introduces
+ a security risk by allowing the owner of the db-level policy file to
+ grant himself/herself load privileges to anything the
+ <codeph>impala</codeph> user has read permissions for in HDFS
+ (including data in other databases controlled by different db-level
+ policy files). </note>
</conbody>
</concept>
</concept>
-
<concept id="security_schema">
-
<title>Setting Up Schema Objects for a Secure Impala Deployment</title>
-
<conbody>
-
- <p>
- Remember that in your role definitions, you specify privileges at the level of individual databases and
- tables, or all databases or all tables within a database. To simplify the structure of these rules, plan
- ahead of time how to name your schema objects so that data with different authorization requirements is
- divided into separate databases.
- </p>
-
- <p>
- If you are adding security on top of an existing Impala deployment, remember that you can rename tables or
- even move them between databases using the <codeph>ALTER TABLE</codeph> statement. In Impala, creating new
- databases is a relatively inexpensive operation, basically just creating a new directory in HDFS.
- </p>
-
- <p>
- You can also plan the security scheme and set up the policy file before the actual schema objects named in
- the policy file exist. Because the authorization capability is based on whitelisting, a user can only
- create a new database or table if the required privilege is already in the policy file: either by listing
- the exact name of the object being created, or a <codeph>*</codeph> wildcard to match all the applicable
- objects within the appropriate container.
- </p>
+ <p> In your role definitions, you must specify privileges at the level of
+ individual databases and tables, or all databases or all tables within a
+ database. To simplify the structure of these rules, plan ahead of time
+ how to name your schema objects so that data with different
+ authorization requirements is divided into separate databases. </p>
+ <p> If you are adding security on top of an existing Impala deployment,
+ you can rename tables or even move them between databases using the
+ <codeph>ALTER TABLE</codeph> statement. </p>
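+ <p> As a minimal sketch of such a reorganization, assuming a hypothetical
+ table <codeph>sales.customer_pii</codeph> and a hypothetical destination
+ database <codeph>sales_restricted</codeph> (neither name comes from this
+ document), the move might look like the following: </p>
+ <codeblock>-- Hypothetical names: sales, sales_restricted, and customer_pii are placeholders.
+CREATE DATABASE IF NOT EXISTS sales_restricted;
+-- RENAME TO with a different database qualifier moves the table.
+ALTER TABLE sales.customer_pii RENAME TO sales_restricted.customer_pii;</codeblock>
+ <p> Grouping the restricted tables in their own database lets a single
+ database-level grant cover them. </p>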
</conbody>
</concept>
-
- <concept id="security_privileges">
-
- <title>Privilege Model and Object Hierarchy</title>
-
- <conbody>
-
- <p>
- Privileges can be granted on different objects in the schema. Any privilege that can be granted is
- associated with a level in the object hierarchy. If a privilege is granted on a container object in the
- hierarchy, the child object automatically inherits it. This is the same privilege model as Hive and other
- database systems such as MySQL.
- </p>
-
- <p>
- The kinds of objects in the schema hierarchy are:
- </p>
-
-<codeblock>Server
-URI
-Database
- Table
-</codeblock>
-
- <p>
- The server name is specified by the <codeph>-server_name</codeph> option when <cmdname>impalad</cmdname>
- starts. Specify the same name for all <cmdname>impalad</cmdname> nodes in the cluster.
- </p>
-
- <p>
- URIs represent the file paths you specify as part of statements such
- as <codeph>CREATE EXTERNAL TABLE</codeph> and <codeph>LOAD
- DATA</codeph>. Typically, you specify what look like UNIX paths, but
- these locations can also be prefixed with <codeph>hdfs://</codeph> to
- make clear that they are really URIs. To set privileges for a URI,
- specify the name of a directory, and the privilege applies to all the
- files in that directory and any directories underneath it.
- </p>
-
- <p rev="2.3.0 collevelauth">
- In <keyword keyref="impala23_full"/> and higher, you can specify privileges for individual columns.
- Formerly, to specify read privileges at this level, you created a view that queried specific columns
- and/or partitions from a base table, and gave <codeph>SELECT</codeph> privilege on the view but not
- the underlying table. Now, you can use Impala's <xref href="impala_grant.xml"/> and
- <xref href="impala_revoke.xml"/> statements to assign and revoke privileges from specific columns
- in a table.
- </p>
-
- <p>
- URIs must start with <codeph>hdfs://</codeph>,
- <codeph>s3a://</codeph>, <codeph>adl://</codeph>, or
- <codeph>file://</codeph>. If a URI starts with an absolute path, the
- path will be appended to the default filesystem prefix. For example, if
- you specify:
-<codeblock>
-GRANT ALL ON URI '/tmp';
-</codeblock> The above
- statement effectively becomes the following where the default filesystem
- is HDFS.
-<codeblock>
-GRANT ALL ON URI 'hdfs://localhost:20500/tmp';
-</codeblock>
- </p>
-
- <p>
- When defining URIs for HDFS, you must also specify the NameNode. For
- example:
-<codeblock>
-data_read = server=server1->uri=file:///path/to/dir, \
-server=server1->uri=hdfs://namenode:port/path/to/dir
-</codeblock>
- <note type="warning">
- <p>
- Because the NameNode host and port must be specified, enable High
- Availability (HA) to ensure that the URI will remain constant even
- if the NameNode changes.
- </p>
-<codeblock>
-data_read = server=server1->uri=file:///path/to/dir, \
-server=server1->uri=hdfs://ha-nn-uri/path/to/dir
-</codeblock>
- </note>
- </p>
-
-<!-- Experiment with replacing my original copied table with a conref'ed version of Ambreen's from the Security Guide. -->
-
-<!--
- <table>
- <title>Sentry privilege types and objects they apply to</title>
- <tgroup cols="2">
- <colspec colnum="1" colname="col1"/>
- <colspec colnum="2" colname="col2"/>
- <tbody>
- <row>
- <entry>Privilege</entry>
- <entry>Object</entry>
- </row>
- <row>
- <entry>INSERT</entry>
- <entry>TABLE, URI</entry>
- </row>
- <row>
- <entry>SELECT</entry>
- <entry>TABLE, VIEW, URI</entry>
- </row>
- <row>
- <entry>ALL</entry>
- <entry>SERVER, DB, URI</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
--->
-
- <table conref="../shared/impala_common.xml#common/sentry_privileges_objects">
- <tgroup cols="2">
- <colspec colnum="1" colname="col1" colwidth="1*"/>
- <tbody>
- <row>
- <entry/>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <note>
- <p>
- Although this document refers to the <codeph>ALL</codeph> privilege, currently if you use the policy file
- mode, you do not use the actual keyword <codeph>ALL</codeph> in the policy file. When you code role
- entries in the policy file:
- </p>
- <ul>
- <li>
- To specify the <codeph>ALL</codeph> privilege for a server, use a role like
- <codeph>server=<varname>server_name</varname></codeph>.
- </li>
-
- <li>
- To specify the <codeph>ALL</codeph> privilege for a database, use a role like
- <codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname></codeph>.
- </li>
-
- <li>
- To specify the <codeph>ALL</codeph> privilege for a table, use a role like
- <codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=*</codeph>.
- </li>
- </ul>
- </note>
- <table>
- <tgroup cols="4">
- <colspec colnum="1" colname="col1" colwidth="1.31*"/>
- <colspec colnum="2" colname="col2" colwidth="1.17*"/>
- <colspec colnum="3" colname="col3" colwidth="1*"/>
- <colspec colname="newCol4" colnum="4" colwidth="1*"/>
- <thead>
- <row>
- <entry>
- Operation
- </entry>
- <entry>
- Scope
- </entry>
- <entry>
- Privileges
- </entry>
- <entry>
- URI
- </entry>
- </row>
- </thead>
- <tbody>
- <row conref="../shared/impala_common.xml#common/explain_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/load_data_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/create_database_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/drop_database_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/create_table_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/drop_table_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/describe_table_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_add_columns_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_replace_columns_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_change_column_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_rename_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_set_tblproperties_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_set_fileformat_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_set_location_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_add_partition_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_add_partition_location_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_drop_partition_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_partition_set_fileformat_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_set_serdeproperties_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/create_view_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/drop_view_privs">
- <entry/>
- </row>
- <row id="alter_view_privs">
- <entry>
- ALTER VIEW
- </entry>
- <entry rev="2.3.0 collevelauth">
- You need <codeph>ALL</codeph> privilege on the named view <ph rev="1.4.0">and the parent
- database</ph>, plus <codeph>SELECT</codeph> privilege for any tables or views referenced by the
- view query. Once the view is created or altered by a high-privileged system administrator, it can
- be queried by a lower-privileged user who does not have full query privileges for the base tables.
- </entry>
- <entry>
- ALL, SELECT
- </entry>
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/alter_table_set_location_privs">
- <entry/>
- </row>
- <row id="create_external_table_privs">
- <entry>
- CREATE EXTERNAL TABLE
- </entry>
- <entry>
- Database (ALL), URI (SELECT)
- </entry>
- <entry>
- ALL, SELECT
- </entry>
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/select_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/use_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/create_function_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/drop_function_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/refresh_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/invalidate_metadata_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/invalidate_metadata_table_privs">
- <entry/>
- </row>
- <row conref="../shared/impala_common.xml#common/compute_stats_privs">
- <entry/>
- </row>
- <row id="show_table_stats_privs">
- <entry>
- SHOW TABLE STATS, SHOW PARTITIONS
- </entry>
- <entry>
- TABLE
- </entry>
- <entry>
- SELECT/INSERT
- </entry>
- <entry/>
- </row>
- <row>
- <entry id="show_column_stats_privs">
- SHOW COLUMN STATS
- </entry>
- <entry>
- TABLE
- </entry>
- <entry>
- SELECT/INSERT
- </entry>
- <entry/>
- </row>
- <row>
- <entry id="show_functions_privs">
- SHOW FUNCTIONS
- </entry>
- <entry>
- DATABASE
- </entry>
- <entry>
- SELECT
- </entry>
- <entry/>
- </row>
- <row id="show_tables_privs">
- <entry>
- SHOW TABLES
- </entry>
- <entry/>
- <entry>
- No special privileges needed to issue the statement, but only shows objects you are authorized for
- </entry>
- <entry/>
- </row>
- <row id="show_databases_privs">
- <entry>
- SHOW DATABASES, SHOW SCHEMAS
- </entry>
- <entry/>
- <entry>
- No special privileges needed to issue the statement, but only shows objects you are authorized for
- </entry>
- <entry/>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- </conbody>
- </concept>
-
<concept id="sentry_debug">
-
- <title><ph conref="../shared/impala_common.xml#common/title_sentry_debug"/></title>
-
+ <title><ph conref="../shared/impala_common.xml#common/title_sentry_debug"
+ /></title>
<conbody>
-
<p conref="../shared/impala_common.xml#common/sentry_debug"/>
</conbody>
</concept>
-
<concept id="sec_ex_default">
-
<title>The DEFAULT Database in a Secure Deployment</title>
-
<conbody>
-
- <p>
- Because of the extra emphasis on granular access controls in a secure deployment, you should move any
- important or sensitive information out of the <codeph>DEFAULT</codeph> database into a named database whose
- privileges are specified in the policy file. Sometimes you might need to give privileges on the
- <codeph>DEFAULT</codeph> database for administrative reasons; for example, as a place you can reliably
- specify with a <codeph>USE</codeph> statement when preparing to drop a database.
+ <p> Because of the extra emphasis on granular access controls in a secure
+ deployment, you should move any important or sensitive information out
+ of the <codeph>DEFAULT</codeph> database into a named database whose
+ privileges are specified in the policy file. Sometimes you might need to
+ give privileges on the <codeph>DEFAULT</codeph> database for
+ administrative reasons; for example, as a place you can reliably specify
+ with a <codeph>USE</codeph> statement when preparing to drop a database.
</p>
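+ <p> For instance, a session that drops a database might first switch to
+ <codeph>DEFAULT</codeph> so that the database being dropped is not the
+ current one. This is a minimal sketch: <codeph>scratch_db</codeph> is a
+ hypothetical database assumed to contain no tables, and the user is
+ assumed to hold the privileges needed for both statements. </p>
+ <codeblock>-- Leave the database that is about to be dropped.
+USE default;
+-- scratch_db is a placeholder name, assumed to be empty.
+DROP DATABASE scratch_db;</codeblock>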
-
-<!-- Maybe have an example later, but not for initial 1.1 release.
-<codeblock></codeblock>
--->
</conbody>
</concept>
</concept>