Posted to commits@impala.apache.org by ar...@apache.org on 2019/05/29 23:28:35 UTC
[impala] 02/02: IMPALA-8049: [DOCS] Ranger authz support in impala
This is an automated email from the ASF dual-hosted git repository.
arodoni pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit a7b8c1e9574afba4385d4518713e412bdeaedb8c
Author: Alex Rodoni <ar...@cloudera.com>
AuthorDate: Fri May 17 13:59:10 2019 -0700
IMPALA-8049: [DOCS] Ranger authz support in impala
Change-Id: I4858bc49c1ed6d5e65ddbaebc96e56427446bad6
Reviewed-on: http://gerrit.cloudera.org:8080/13368
Reviewed-by: Fredy Wijaya <fw...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
docs/topics/impala_authorization.xml | 184 +++++++++++++++++++---------------
docs/topics/impala_config_options.xml | 11 +-
2 files changed, 109 insertions(+), 86 deletions(-)
diff --git a/docs/topics/impala_authorization.xml b/docs/topics/impala_authorization.xml
index a2b7399..c49fa97 100644
--- a/docs/topics/impala_authorization.xml
+++ b/docs/topics/impala_authorization.xml
@@ -20,7 +20,7 @@ under the License.
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept rev="1.1" id="authorization">
- <title>Enabling Sentry Authorization for Impala</title>
+ <title>Impala Authorization</title>
<prolog>
<metadata>
@@ -39,33 +39,24 @@ under the License.
<p>
Authorization determines which users are allowed to access which resources, and what
- operations they are allowed to perform. In Impala 1.1 and higher, you use Apache Sentry
- for authorization. Sentry adds a fine-grained authorization framework for Hadoop. By
- default (when authorization is not enabled), Impala does all read and write operations
- with the privileges of the <codeph>impala</codeph> user, which is suitable for a
- development/test environment but not for a secure production environment. When
- authorization is enabled, Impala uses the OS user ID of the user who runs
- <cmdname>impala-shell</cmdname> or other client program, and associates various privileges
- with each user.
+ operations they are allowed to perform. You use Apache Sentry or Apache Ranger for
+ authorization. By default, when authorization is not enabled, Impala does all read and
+ write operations with the privileges of the <codeph>impala</codeph> user, which is
+ suitable for a development/test environment but not for a secure production environment.
+ When authorization is enabled, Impala uses the OS user ID of the user who runs
+ <cmdname>impala-shell</cmdname> or other client programs, and associates various
+ privileges with each user.
</p>
- <note>
- Sentry is typically used in conjunction with Kerberos authentication, which defines which
- hosts are allowed to connect to each server. Using the combination of Sentry and Kerberos
- prevents malicious users from being able to connect by creating a named account on an
- untrusted machine. See <xref href="impala_kerberos.xml#kerberos"/> for details about
- Kerberos authentication.
- </note>
-
<p audience="PDF" outputclass="toc inpage">
- See the following sections for details about using the Impala authorization features:
+ See the following sections for details about using the Impala authorization features.
</p>
</conbody>
<concept id="sentry_priv_model">
- <title>The Sentry Privilege Model</title>
+ <title>The Privilege Model</title>
<conbody>
@@ -99,16 +90,17 @@ under the License.
<p conref="../shared/impala_common.xml#common/sentry_privileges_objects"/>
- <p> Privileges are managed via the <codeph>GRANT</codeph> and
- <codeph>REVOKE</codeph> SQL statements that requires the Sentry
- service enabled. The Sentry service stores, retrieves, and manipulates
- privilege information stored inside the Sentry database. </p>
+ <p>
+ Privileges are managed via the <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> SQL
+ statements that require the Sentry or Ranger service enabled.
+ </p>
- <p> If you change privileges outside of Impala, e.g. adding a user,
- removing a user, modifying privileges, you must clear the Impala Catalog
- server cache by running the <codeph>REFRESH AUTHORIZATION</codeph>
- statement. <codeph>REFRESH AUTHORIZATION</codeph> is not required if you
- make the changes to privileges within Impala. </p>
+ <p>
+ If you change privileges outside of Impala, e.g. adding a user, removing a user,
+ modifying privileges, you must clear the Impala Catalog server cache by running the
+ <codeph>REFRESH AUTHORIZATION</codeph> statement. <codeph>REFRESH AUTHORIZATION</codeph>
+ is not required if you make the changes to privileges within Impala.
+ </p>
</conbody>
@@ -116,7 +108,7 @@ under the License.
<concept id="secure_startup">
- <title>Starting the impalad Daemon with Sentry Authorization Enabled</title>
+ <title>Starting Impala with Sentry Authorization Enabled</title>
<prolog>
<metadata>
@@ -127,65 +119,91 @@ under the License.
<conbody>
<p>
- To run the <cmdname>impalad</cmdname> daemon with authorization enabled, you add one or
- more options to the <codeph>IMPALA_SERVER_ARGS</codeph> declaration in the
- <filepath>/etc/default/impala</filepath> configuration file:
+ To enable authorization in an Impala cluster using Sentry:
+ <ol>
+ <li>
+ Add the following options to the <codeph>IMPALA_SERVER_ARGS</codeph> and the
+ <codeph>IMPALA_CATALOG_ARGS</codeph> settings in the
+ <filepath>/etc/default/impala</filepath> configuration file:
+ <ul>
+ <li>
+ <codeph>-server_name</codeph>: For all <cmdname>impalad</cmdname> nodes and the
+ <codeph>catalogd</codeph> in the cluster, specify the same name set in the
+ <codeph>sentry.hive.server</codeph> property in the
+ <filepath>sentry-site.xml</filepath> configuration file for Hive.
+ </li>
+
+ <li>
+ <codeph>-sentry_config</codeph>: Specifies the local path to the
+ <codeph>sentry-site.xml</codeph> configuration file.
+ </li>
+ </ul>
+ </li>
+
+ <li>
+ Restart the <codeph>catalogd</codeph> and all <cmdname>impalad</cmdname> daemons.
+ </li>
+ </ol>
</p>
- <ul>
- <li>
- <codeph>-server_name</codeph>: Turns on Sentry authorization for Impala. The
- authorization rules refer to a symbolic server name, and you specify the same name to
- use as the argument to the <codeph>-server_name</codeph> option for all
- <cmdname>impalad</cmdname> nodes in the cluster.
- </li>
+ </conbody>
- <li>
- <codeph>-sentry_config</codeph>: Specifies the local path to the
- <codeph>sentry-site.xml</codeph> configuration file. This setting is required to
- enable authorization.
- </li>
- </ul>
+ </concept>
- <p rev="1.4.0">
- For example, you might adapt your <filepath>/etc/default/impala</filepath> configuration
- to contain lines like the following. To use the Sentry service:
- </p>
+ <concept id="enable_ranger_authz">
-<codeblock rev="1.4.0">IMPALA_SERVER_ARGS=" \
--server_name=server1 \
-...
-</codeblock>
+ <title>Starting Impala with Ranger Authorization Enabled</title>
- <p>
- The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to
- the current instance of Impala. Specify the symbolic name for the
- <codeph>sentry.hive.server</codeph> property in the <filepath>sentry-site.xml</filepath>
- configuration file for Hive, as well as in the <codeph>-server_name</codeph> option for
- <cmdname>impalad</cmdname>.
- </p>
+ <conbody>
<p>
- Now restart the <cmdname>impalad</cmdname> daemons on all the nodes.
+ To enable authorization in an Impala cluster using Ranger:
</p>
+ <ol>
+ <li>
+ Add the following options to the <codeph>IMPALA_SERVER_ARGS</codeph> and the
+ <codeph>IMPALA_CATALOG_ARGS</codeph> settings in the
+ <filepath>/etc/default/impala</filepath> configuration file:
+ <ul>
+ <li>
+ <codeph>-server_name</codeph>: Specify the same name for all
+ <cmdname>impalad</cmdname> nodes and the <codeph>catalogd</codeph> in the cluster.
+ </li>
+
+ <li>
+ <codeph>-ranger_service_type=hive</codeph>
+ </li>
+
+ <li>
+ <codeph>-ranger_app_id</codeph>: Set it to the Ranger application id.
+ </li>
+
+ <li>
+ <codeph>-authorization_provider=ranger</codeph>
+ </li>
+ </ul>
+ </li>
+
+ <li>
+ Restart the <codeph>catalogd</codeph> and all <cmdname>impalad</cmdname> daemons.
+ </li>
+ </ol>
+
</conbody>
</concept>
<concept id="sentry_service">
- <title>Using Impala with the Sentry Service</title>
+ <title>Managing Privileges</title>
<conbody>
<p>
- When you use the Sentry service, you set up privileges through the
- <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in either Impala or Hive.
- Then both components use those same privileges automatically. (Impala added the
- <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in
- <keyword keyref="impala20_full"
- />.)
+ You set up privileges through the <codeph>GRANT</codeph> and <codeph>REVOKE</codeph>
+ statements in either Impala or Hive. Then both components use those same privileges
+ automatically.
</p>
<p>
@@ -200,14 +218,14 @@ under the License.
<concept id="changing_privileges">
- <title>Changing Privileges</title>
+ <title>Changing Privileges from Outside of Impala</title>
<conbody>
<p>
- If you make a change to privileges in Sentry from outside of Impala, e.g. adding a
- user, removing a user, modifying privileges, there are two options to propagate the
- change:
+ If you make a change to privileges in Sentry or Ranger from outside of Impala, e.g.
+ adding a user, removing a user, modifying privileges, there are two options to
+ propagate the change:
</p>
<ul>
@@ -218,9 +236,15 @@ under the License.
</li>
<li>
- Run the <codeph>INVALIDATE METADATA</codeph> statement to force a Sentry refresh.
- <codeph>INVALIDATE METADATA</codeph> forces a Sentry refresh regardless of the
- <codeph>--sentry_catalog_polling_fequency_s</codeph> flag.
+ Use the <codeph>ranger.plugin.hive.policy.pollIntervalMs</codeph> property to
+ specify how often to do a Ranger refresh. The property is specified in
+ <codeph>ranger-hive-security.xml</codeph> in the <codeph>conf</codeph> directory
+ under your Impala home directory.
+ </li>
+
+ <li>
+ Run the <codeph>INVALIDATE METADATA</codeph> or <codeph>REFRESH
+ AUTHORIZATION</codeph> statement to force a refresh.
</li>
</ul>
@@ -366,7 +390,7 @@ GRANT SELECT ON TABLE db1.training TO ROLE student;</codeblock>
<title>Privileges for Working with External Data Files</title>
<p>
- When data is being inserted through the <codeph>LOAD DATA</codeph> statement, or is
+ When data is being inserted through the <codeph>LOAD DATA</codeph> statement or is
referenced from an HDFS location outside the normal Impala database directories, the
user also needs appropriate permissions on the URIs corresponding to those HDFS
locations.
@@ -409,9 +433,9 @@ GRANT ALL ON URI 'hdfs://127.0.0.1:8020/user/impala-user/external_data' TO ROLE
<p>
To create a database, you need the full privilege on that database while day-to-day
operations on tables within that database can be performed with lower levels of
- privilege on specific table. Thus, you might set up separate roles for each database
- or application: an administrative one that could create or drop the database, and a
- user-level one that can access only the relevant tables.
+ privilege on a specific table. Thus, you might set up separate roles for each
+ database or application: an administrative one that could create or drop the
+ database, and a user-level one that can access only the relevant tables.
</p>
<p>
@@ -469,7 +493,7 @@ GRANT SELECT ON TABLE training1.course1 TO ROLE student;</codeblock>
In your role definitions, you must specify privileges at the level of individual
databases and tables, or all databases or all tables within a database. To simplify the
structure of these rules, plan ahead of time how to name your schema objects so that
- data with different authorization requirements is divided into separate databases.
+ data with different authorization requirements are divided into separate databases.
</p>
<p>
diff --git a/docs/topics/impala_config_options.xml b/docs/topics/impala_config_options.xml
index 425b38b..469fa62 100644
--- a/docs/topics/impala_config_options.xml
+++ b/docs/topics/impala_config_options.xml
@@ -194,12 +194,11 @@ Starting Impala Catalog Server: [ OK ]</codeblock>
<li>
<p>
- Authorization using the open source Sentry plugin. Specify the
- <codeph>‑‑server_name</codeph> and
- <codeph>‑‑authorization_policy_file</codeph> options as part of the
- <codeph>IMPALA_SERVER_ARGS</codeph> and <codeph>IMPALA_STATE_STORE_ARGS</codeph>
- settings to enable the core Impala support for authentication. See
- <xref
+ Authorization. Specify the
+ <codeph>‑‑server_name</codeph> option as part of the
+ <codeph>IMPALA_SERVER_ARGS</codeph> and
+ <codeph>IMPALA_CATALOG_ARGS</codeph> settings to enable the core
+ Impala support for authorization. See <xref
href="impala_authorization.xml#secure_startup"/> for details.
</p>
</li>
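Illustration (not part of the commit): the Ranger startup options added to impala_authorization.xml above could be combined in /etc/default/impala roughly as sketched below. This is a minimal sketch under assumptions. The values `server1` and `impala` are placeholders rather than defaults, and `RANGER_AUTHZ_ARGS` is a hypothetical helper variable used only to avoid repeating the flags. Only the four flag names and the two `*_ARGS` setting names come from the documentation change itself.

```shell
# Sketch of an /etc/default/impala fragment for Ranger authorization.
# Assumptions: "server1" and "impala" are placeholder values, and
# RANGER_AUTHZ_ARGS is a hypothetical helper variable, not an Impala setting.
RANGER_AUTHZ_ARGS="-server_name=server1 \
  -ranger_service_type=hive \
  -ranger_app_id=impala \
  -authorization_provider=ranger"

# The same options go to both the impalad daemons and the catalogd.
IMPALA_SERVER_ARGS="${IMPALA_SERVER_ARGS} ${RANGER_AUTHZ_ARGS}"
IMPALA_CATALOG_ARGS="${IMPALA_CATALOG_ARGS} ${RANGER_AUTHZ_ARGS}"
```

As the documentation change states, the catalogd and all impalad daemons must be restarted after editing these settings.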