Posted to commits@hbase.apache.org by nd...@apache.org on 2015/08/19 00:35:57 UTC
[05/15] hbase git commit: HBASE-14066 clean out old docbook docs from
branch-1.
http://git-wip-us.apache.org/repos/asf/hbase/blob/0acbff24/src/main/docbkx/security.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/security.xml b/src/main/docbkx/security.xml
deleted file mode 100644
index d649f95..0000000
--- a/src/main/docbkx/security.xml
+++ /dev/null
@@ -1,1895 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<chapter
- version="5.0"
- xml:id="security"
- xmlns="http://docbook.org/ns/docbook"
- xmlns:xlink="http://www.w3.org/1999/xlink"
- xmlns:xi="http://www.w3.org/2001/XInclude"
- xmlns:svg="http://www.w3.org/2000/svg"
- xmlns:m="http://www.w3.org/1998/Math/MathML"
- xmlns:html="http://www.w3.org/1999/xhtml"
- xmlns:db="http://docbook.org/ns/docbook">
- <!--
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements. See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership. The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
- <title>Secure Apache HBase</title>
- <section
- xml:id="hbase.secure.configuration">
- <title>Secure Client Access to Apache HBase</title>
- <para>Newer releases of Apache HBase (>= 0.92) support optional SASL authentication of clients. See also Matteo Bertozzi's article on <link
- xlink:href="http://www.cloudera.com/blog/2012/09/understanding-user-authentication-and-authorization-in-apache-hbase/">Understanding
- User Authentication and Authorization in Apache HBase</link>.</para>
- <para>This describes how to set up Apache HBase and clients for connection to secure HBase
- resources.</para>
-
- <section xml:id="security.prerequisites">
- <title>Prerequisites</title>
- <variablelist>
- <varlistentry>
- <term>Hadoop Authentication Configuration</term>
- <listitem>
- <para>To run HBase RPC with strong authentication, you must set
- <code>hbase.security.authentication</code> to <literal>kerberos</literal>. In this case,
- you must also set <code>hadoop.security.authentication</code> to
- <literal>kerberos</literal>. Otherwise, you would be using strong authentication for
- HBase but not for the underlying HDFS, which would cancel out any benefit.</para>
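- <para>As a minimal sketch, the matching Hadoop-side setting would normally live in
- <filename>core-site.xml</filename>. The file location is an assumption about a typical
- Hadoop deployment; consult your Hadoop documentation for the complete set of
- properties that a secure HDFS requires.</para>
- <programlisting language="xml"><![CDATA[
-<!-- core-site.xml (assumed location): enable Kerberos authentication for Hadoop -->
-<property>
- <name>hadoop.security.authentication</name>
- <value>kerberos</value>
-</property>
- ]]></programlisting>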
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Kerberos KDC</term>
- <listitem>
- <para> You need to have a working Kerberos KDC. </para>
- <para> An HBase cluster configured for secure client access is expected to be running on top of a
- secured HDFS cluster. HBase must be able to authenticate to HDFS services. HBase needs
- Kerberos credentials to interact with the Kerberos-enabled HDFS daemons.
- Authenticating a service should be done using a keytab file. The procedure for
- creating keytabs for HBase service is the same as for creating keytabs for Hadoop.
- Those steps are omitted here. Copy the resulting keytab files to wherever HBase Master
- and RegionServer processes are deployed and make them readable only to the user
- account under which the HBase daemons will run. </para>
- <para> A Kerberos principal has three parts, with the form
- <code>username/fully.qualified.domain.name@YOUR-REALM.COM</code>. We recommend using
- <code>hbase</code> as the username portion. </para>
- <para> The following is an example of the configuration properties for Kerberos
- operation that must be added to the <code>hbase-site.xml</code> file on every server
- machine in the cluster. These properties are required for even the most basic interactions
- with a secure Hadoop configuration, independent of HBase security. </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.regionserver.kerberos.principal</name>
- <value>hbase/_HOST@YOUR-REALM.COM</value>
-</property>
-<property>
- <name>hbase.regionserver.keytab.file</name>
- <value>/etc/hbase/conf/keytab.krb5</value>
-</property>
-<property>
- <name>hbase.master.kerberos.principal</name>
- <value>hbase/_HOST@YOUR-REALM.COM</value>
-</property>
-<property>
- <name>hbase.master.keytab.file</name>
- <value>/etc/hbase/conf/keytab.krb5</value>
-</property>
- ]]></programlisting>
- <para> Each HBase client user should also be given a Kerberos principal. This principal
- should have a password assigned to it (as opposed to a keytab file). The client
- principal's <code>maxrenewlife</code> should be set so that it can be renewed enough
- times for the HBase client process to complete. For example, if a user runs a
- long-running HBase client process that takes at most 3 days, we might create this
- user's principal within <code>kadmin</code> with: <code>addprinc -maxrenewlife
- 3days</code>
- </para>
- <para> Long running daemons with indefinite lifetimes that require client access to
- HBase can instead be configured to log in from a keytab. For each host running such
- daemons, create a keytab with <code>kadmin</code> or <code>kadmin.local</code>. The
- procedure for creating keytabs for HBase service is the same as for creating keytabs
- for Hadoop. Those steps are omitted here. Copy the resulting keytab files to where the
- client daemon will execute and make them readable only to the user account under which
- the daemon will run. </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section>
- <title>Server-side Configuration for Secure Operation</title>
- <para>First, refer to <xref linkend="security.prerequisites" /> and ensure that your
- underlying HDFS configuration is secure.</para>
- <para> Add the following to the <code>hbase-site.xml</code> file on every server machine in
- the cluster: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.security.authentication</name>
- <value>kerberos</value>
-</property>
-<property>
- <name>hbase.security.authorization</name>
- <value>true</value>
-</property>
-<property>
- <name>hbase.coprocessor.region.classes</name>
- <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
-</property>
- ]]></programlisting>
- <para> A full shutdown and restart of HBase service is required when deploying these
- configuration changes. </para>
- </section>
-
- <section>
- <title>Client-side Configuration for Secure Operation</title>
- <para>First, refer to <xref linkend="security.prerequisites" /> and ensure that your
- underlying HDFS configuration is secure.</para>
- <para> Add the following to the <code>hbase-site.xml</code> file on every client: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.security.authentication</name>
- <value>kerberos</value>
-</property>
- ]]></programlisting>
- <para> The client environment must be logged in to Kerberos from KDC or keytab via the
- <code>kinit</code> command before communication with the HBase cluster will be possible. </para>
- <para> Be advised that if the value of <code>hbase.security.authentication</code> in the client- and
- server-side site files does not match, the client will not be able to communicate with the
- cluster. </para>
- <para> Once HBase is configured for secure RPC, you can optionally configure
- encrypted communication. To do so, add the following to the <code>hbase-site.xml</code> file
- on every client: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.rpc.protection</name>
- <value>privacy</value>
-</property>
- ]]></programlisting>
- <para> This configuration property can also be set on a per connection basis. Set it in the
- <code>Configuration</code> supplied to <code>HTable</code>: </para>
- <programlisting language="java">
-Configuration conf = HBaseConfiguration.create();
-conf.set("hbase.rpc.protection", "privacy");
-HTable table = new HTable(conf, tablename);
- </programlisting>
- <para> Expect a ~10% performance penalty for encrypted communication. </para>
- </section>
-
-
- <section xml:id="security.client.thrift">
- <title>Client-side Configuration for Secure Operation - Thrift Gateway</title>
- <para> Add the following to the <code>hbase-site.xml</code> file for every Thrift gateway: <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.thrift.keytab.file</name>
- <value>/etc/hbase/conf/hbase.keytab</value>
-</property>
-<property>
- <name>hbase.thrift.kerberos.principal</name>
- <value>$USER/_HOST@HADOOP.LOCALDOMAIN</value>
- <!-- TODO: This may need to be HTTP/_HOST@<REALM> and _HOST may not work.
- You may have to put the concrete full hostname.
- -->
-</property>
- ]]></programlisting>
- </para>
- <para> Substitute the appropriate credential for <replaceable>$USER</replaceable>, and replace
- the example keytab path with the location of your own keytab file. </para>
- <para>In order to use the Thrift API principal to interact with HBase, it is also necessary to
- add the <code>hbase.thrift.kerberos.principal</code> to the <code>_acl_</code> table. For
- example, to give the Thrift API principal, <code>thrift_server</code>, administrative
- access, a command such as this one will suffice: </para>
- <programlisting language="sql"><![CDATA[
-grant 'thrift_server', 'RWCA'
- ]]></programlisting>
- <para>For more information about ACLs, please see the <link
- linkend="hbase.accesscontrol.configuration">Access Control</link> section.</para>
-
- <para> The Thrift gateway will authenticate with HBase using the supplied credential. No
- authentication will be performed by the Thrift gateway itself. All client access via the
- Thrift gateway will use the Thrift gateway's credential and have its privilege. </para>
- </section>
- <section xml:id="security.gateway.thrift">
- <title>Configure the Thrift Gateway to Authenticate on Behalf of the Client</title>
- <para><xref linkend="security.client.thrift"/> describes how to authenticate a Thrift client
- to HBase using a fixed user. As an alternative, you can configure the Thrift gateway to
- authenticate to HBase on the client's behalf, and to access HBase using a proxy user. This
- was implemented in <link xlink:href="https://issues.apache.org/jira/browse/HBASE-11349"
- >HBASE-11349</link> for Thrift 1, and <link
- xlink:href="https://issues.apache.org/jira/browse/HBASE-11474">HBASE-11474</link> for
- Thrift 2.</para>
- <note>
- <title>Limitations with Thrift Framed Transport</title>
- <para>If you use framed transport, you cannot yet take advantage of this feature, because
- SASL does not work with Thrift framed transport at this time.</para>
- </note>
- <para>To enable it, do the following.</para>
- <procedure>
- <step>
- <para>Be sure Thrift is running in secure mode, by following the procedure described in
- <xref linkend="security.client.thrift"/>.</para>
- </step>
- <step>
- <para>Be sure that HBase is configured to allow proxy users, as described in <xref
- linkend="security.rest.gateway"/>.</para>
- </step>
- <step>
- <para>In <filename>hbase-site.xml</filename> for each cluster node running a Thrift
- gateway, set the property <code>hbase.thrift.security.qop</code> to one of the following
- three values:</para>
- <itemizedlist>
- <listitem>
- <para><literal>auth-conf</literal> - authentication, integrity, and confidentiality
- checking</para>
- </listitem>
- <listitem>
- <para><literal>auth-int</literal> - authentication and integrity checking</para>
- </listitem>
- <listitem>
- <para><literal>auth</literal> - authentication checking only</para>
- </listitem>
- </itemizedlist>
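- <para>For illustration, a sketch of the resulting entry, assuming the strongest of the
- three settings is wanted. Choose the value that fits your deployment.</para>
- <programlisting language="xml"><![CDATA[
-<!-- Illustrative value only: use auth-conf, auth-int, or auth -->
-<property>
- <name>hbase.thrift.security.qop</name>
- <value>auth-conf</value>
-</property>
- ]]></programlisting>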
- </step>
- <step>
- <para>Restart the Thrift gateway processes for the changes to take effect. If a node is
- running Thrift, the output of the <command>jps</command> command will list a
- <code>ThriftServer</code> process. To stop Thrift on a node, run the command
- <command>bin/hbase-daemon.sh stop thrift</command>. To start Thrift on a node, run the
- command <command>bin/hbase-daemon.sh start thrift</command>.</para>
- </step>
- </procedure>
- </section>
-
- <section>
- <title>Client-side Configuration for Secure Operation - REST Gateway</title>
- <para> Add the following to the <code>hbase-site.xml</code> file for every REST gateway: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.rest.keytab.file</name>
- <value>$KEYTAB</value>
-</property>
-<property>
- <name>hbase.rest.kerberos.principal</name>
- <value>$USER/_HOST@HADOOP.LOCALDOMAIN</value>
-</property>
- ]]></programlisting>
- <para> Substitute the appropriate credential and keytab for <replaceable>$USER</replaceable>
- and <replaceable>$KEYTAB</replaceable> respectively. </para>
- <para> The REST gateway will authenticate with HBase using the supplied credential. No
- authentication will be performed by the REST gateway itself. All client access via the REST
- gateway will use the REST gateway's credential and have its privilege. </para>
- <para>In order to use the REST API principal to interact with HBase, it is also necessary to
- add the <code>hbase.rest.kerberos.principal</code> to the <code>_acl_</code> table. For
- example, to give the REST API principal, <code>rest_server</code>, administrative access, a
- command such as this one will suffice: </para>
- <programlisting language="sql"><![CDATA[
-grant 'rest_server', 'RWCA'
- ]]></programlisting>
- <para>For more information about ACLs, please see the <link
- linkend="hbase.accesscontrol.configuration">Access Control</link> section.</para>
- <para> It should be possible for clients to authenticate with the HBase cluster through the
- REST gateway in a pass-through manner via SPNEGO HTTP authentication. This is future work.
- </para>
- </section>
-
- <section xml:id="security.rest.gateway">
- <title>REST Gateway Impersonation Configuration</title>
- <para> By default, the REST gateway doesn't support impersonation. It accesses HBase on
- behalf of clients as the user configured in the previous section. To the HBase server,
- all requests come from the REST gateway user, and the actual users are unknown. You can turn on
- impersonation support. With impersonation, the REST gateway user is a proxy user. The
- HBase server knows the actual user of each request, so it can apply proper
- authorizations. </para>
- <para> To turn on REST gateway impersonation, configure the HBase servers (masters and
- region servers) to allow proxy users, and configure the REST gateway to enable impersonation. </para>
- <para> To allow proxy users, add the following to the <code>hbase-site.xml</code> file for
- every HBase server: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hadoop.security.authorization</name>
- <value>true</value>
-</property>
-<property>
- <name>hadoop.proxyuser.$USER.groups</name>
- <value>$GROUPS</value>
-</property>
-<property>
- <name>hadoop.proxyuser.$USER.hosts</name>
- <value>$HOSTS</value>
-</property>
- ]]></programlisting>
- <para> Substitute the REST gateway proxy user for $USER, the allowed group list for
- $GROUPS, and the list of hosts from which the proxy user may connect for $HOSTS. </para>
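- <para> As an illustration only, the following sketch assumes that the REST gateway runs as
- the user <code>rest_server</code>, that the users to be impersonated belong to a
- hypothetical group named <code>hbase_rest_users</code>, and that the gateway runs on a
- hypothetical host named <code>rest-gw.example.com</code>. In Hadoop's proxy user
- settings, the <code>hosts</code> property lists the hosts from which the proxy user may
- submit requests. </para>
- <programlisting language="xml"><![CDATA[
-<!-- Hypothetical values: replace the user, group, and host with your own -->
-<property>
- <name>hadoop.proxyuser.rest_server.groups</name>
- <value>hbase_rest_users</value>
-</property>
-<property>
- <name>hadoop.proxyuser.rest_server.hosts</name>
- <value>rest-gw.example.com</value>
-</property>
- ]]></programlisting>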
- <para> To enable REST gateway impersonation, add the following to the
- <code>hbase-site.xml</code> file for every REST gateway. </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.rest.authentication.type</name>
- <value>kerberos</value>
-</property>
-<property>
- <name>hbase.rest.authentication.kerberos.principal</name>
- <value>HTTP/_HOST@HADOOP.LOCALDOMAIN</value>
-</property>
-<property>
- <name>hbase.rest.authentication.kerberos.keytab</name>
- <value>$KEYTAB</value>
-</property>
- ]]></programlisting>
- <para> Substitute the path of the keytab for the HTTP principal for $KEYTAB. </para>
- </section>
-
- </section>
- <!-- Secure Client Access to HBase -->
-
- <section
- xml:id="hbase.secure.simpleconfiguration">
- <title>Simple User Access to Apache HBase</title>
- <para>Newer releases of Apache HBase (>= 0.92) support optional SASL authentication of clients. See also Matteo Bertozzi's article on <link
- xlink:href="http://www.cloudera.com/blog/2012/09/understanding-user-authentication-and-authorization-in-apache-hbase/">Understanding
- User Authentication and Authorization in Apache HBase</link>.</para>
- <para>This describes how to set up Apache HBase and clients for simple user access to HBase
- resources.</para>
-
- <section>
- <title>Simple Versus Secure Access</title>
- <para> The following section shows how to set up simple user access. Simple user access is not
- a secure method of operating HBase. This method is used to prevent users from making
- mistakes. It can be used to mimic Access Control on a development system without
- having to set up Kerberos. </para>
- <para> This method does not protect against malicious users or hacking attempts. To make HBase secure
- against these types of attacks, you must configure HBase for secure operation. Refer to the
- section <link
- linkend="hbase.secure.configuration">Secure Client Access to HBase</link> and
- complete all of the steps described there. </para>
-
- <section>
- <title>Prerequisites</title>
- <para> None </para>
-
- <section>
- <title>Server-side Configuration for Simple User Access Operation</title>
- <para> Add the following to the <code>hbase-site.xml</code> file on every server machine
- in the cluster: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.security.authentication</name>
- <value>simple</value>
-</property>
-<property>
- <name>hbase.security.authorization</name>
- <value>true</value>
-</property>
-<property>
- <name>hbase.coprocessor.master.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController</value>
-</property>
-<property>
- <name>hbase.coprocessor.region.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController</value>
-</property>
-<property>
- <name>hbase.coprocessor.regionserver.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController</value>
-</property>
- ]]></programlisting>
- <para> For 0.94, add the following to the <code>hbase-site.xml</code> file on every server
- machine in the cluster: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.rpc.engine</name>
- <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
-</property>
-<property>
- <name>hbase.coprocessor.master.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController</value>
-</property>
-<property>
- <name>hbase.coprocessor.region.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController</value>
-</property>
- ]]></programlisting>
- <para> A full shutdown and restart of HBase service is required when deploying these
- configuration changes. </para>
- </section>
-
- <section>
- <title>Client-side Configuration for Simple User Access Operation</title>
- <para> Add the following to the <code>hbase-site.xml</code> file on every client: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.security.authentication</name>
- <value>simple</value>
-</property>
- ]]></programlisting>
- <para> For 0.94, add the following to the <code>hbase-site.xml</code> file on every
- client: </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.rpc.engine</name>
- <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
-</property>
- ]]></programlisting>
- <para> Be advised that if the value of <code>hbase.security.authentication</code> in the client-
- and server-side site files does not match, the client will not be able to communicate with
- the cluster. </para>
- </section>
-
- <section>
- <title>Client-side Configuration for Simple User Access Operation - Thrift Gateway</title>
- <para>The Thrift gateway user will need access. For example, to give the Thrift API user,
- <code>thrift_server</code>, administrative access, a command such as this one will
- suffice: </para>
- <programlisting language="sql"><![CDATA[
-grant 'thrift_server', 'RWCA'
- ]]></programlisting>
- <para>For more information about ACLs, please see the <link
- linkend="hbase.accesscontrol.configuration">Access Control</link> section.</para>
-
- <para> The Thrift gateway will authenticate with HBase using the supplied credential. No
- authentication will be performed by the Thrift gateway itself. All client access via the
- Thrift gateway will use the Thrift gateway's credential and have its privilege. </para>
- </section>
-
- <section>
- <title>Client-side Configuration for Simple User Access Operation - REST Gateway</title>
-
- <para> The REST gateway will authenticate with HBase using the supplied credential. No
- authentication will be performed by the REST gateway itself. All client access via the
- REST gateway will use the REST gateway's credential and have its privilege. </para>
- <para>The REST gateway user will need access. For example, to give the REST API user,
- <code>rest_server</code>, administrative access, a command such as this one will
- suffice: </para>
- <programlisting language="sql"><![CDATA[
-grant 'rest_server', 'RWCA'
- ]]></programlisting>
- <para>For more information about ACLs, please see the <link
- linkend="hbase.accesscontrol.configuration">Access Control</link> section.</para>
- <para> It should be possible for clients to authenticate with the HBase cluster through
- the REST gateway in a pass-through manner via SPNEGO HTTP authentication. This is future
- work. </para>
- </section>
- </section>
- </section>
-
- </section>
- <!-- Simple User Access to Apache HBase -->
-
- <section>
- <title>Securing Access To Your Data</title>
- <para>After you have configured secure authentication between HBase client and server processes
- and gateways, you need to consider the security of your data itself. HBase provides several
- strategies for securing your data:</para>
- <itemizedlist>
- <listitem>
- <para>Role-based Access Control (RBAC) controls which users or groups can read and write to
- a given HBase resource or execute a coprocessor endpoint, using the familiar paradigm of
- roles.</para>
- </listitem>
- <listitem>
- <para>Visibility Labels allow you to label cells and control access to labelled cells,
- to further restrict who can read or write to certain subsets of your data. Visibility
- labels are stored as tags. See <xref linkend="hbase.tags"/> for more information.</para>
- </listitem>
- <listitem>
- <para>Transparent encryption of data at rest on the underlying filesystem, both in HFiles
- and in the WAL. This protects your data at rest from an attacker who has access to the
- underlying filesystem, without the need to change the implementation of the client. It can
- also protect against data leakage from improperly disposed disks, which can be important
- for legal and regulatory compliance.</para>
- </listitem>
- </itemizedlist>
- <para>Server-side configuration, administration, and implementation details of each of these
- features are discussed below, along with any performance trade-offs. An example security
- configuration is given at the end, to show these features all used together, as they might be
- in a real-world scenario.</para>
- <caution>
- <para>All aspects of security in HBase are in active development and evolving rapidly. Any
- strategy you employ for security of your data should be thoroughly tested. In addition, some
- of these features are still in the experimental stage of development. To take advantage of
- many of these features, you must be running HBase 0.98+ and using the HFile v3 file
- format.</para>
- </caution>
-
- <warning>
- <title>Protecting Sensitive Files</title>
- <para>Several procedures in this section require you to copy files between cluster nodes. When
- copying keys, configuration files, or other files containing sensitive strings, use a secure
- method, such as <code>ssh</code>, to avoid leaking sensitive data.</para>
- </warning>
-
- <procedure xml:id="security.data.basic.server.side">
- <title>Basic Server-Side Configuration</title>
- <step>
- <para>Enable HFile v3, by setting <option>hfile.format.version</option> to 3 in
- <filename>hbase-site.xml</filename>. This is the default for HBase 1.0 and
- newer.</para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hfile.format.version</name>
- <value>3</value>
-</property>
- ]]></programlisting>
- </step>
- <step>
- <para>Enable SASL and Kerberos authentication for RPC and ZooKeeper, as described in <xref
- linkend="security.prerequisites"/> and <xref linkend="zk.sasl.auth"/>.</para>
- </step>
- </procedure>
-
- <section xml:id="hbase.tags">
- <title>Tags</title>
- <para><firstterm>Tags</firstterm> are a feature of HFile v3. A tag is a piece of metadata
- which is part of a cell, separate from the key, value, and version. Tags are an
- implementation detail which provides a foundation for other security-related features such
- as cell-level ACLs and visibility labels. Tags are stored in the HFiles themselves. It is
- possible that in the future, tags will be used to implement other HBase features. You don't
- need to know a lot about tags in order to use the security features they enable.</para>
- <section>
- <title>Implementation Details</title>
- <para> Every cell can have zero or more tags. Every tag has a type and the actual tag byte
- array.</para>
- <para> Just as row keys, column families, qualifiers and values can be encoded (see <xref
- linkend="data.block.encoding.types"/>), tags can also be encoded. You can enable
- or disable tag encoding at the level of the column family, and it is enabled by default.
- Use the <code>HColumnDescriptor#setCompressTags(boolean compressTags)</code> method to
- manage encoding settings on a column family. You also need to enable the DataBlockEncoder
- for the column family, for encoding of tags to take effect.</para>
- <para>You can enable compression of each tag in the WAL, if WAL compression is also enabled,
- by setting the value of <option>hbase.regionserver.wal.tags.enablecompression</option> to
- <literal>true</literal> in <filename>hbase-site.xml</filename>. Tag compression uses
- dictionary encoding.</para>
- <para>Tag compression is not supported when using WAL encryption.</para>
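- <para>A sketch of the two settings used together. The property
- <code>hbase.regionserver.wal.enablecompression</code> is assumed here to be the switch
- for WAL compression itself; verify it against the configuration reference for your
- release.</para>
- <programlisting language="xml"><![CDATA[
-<!-- Assumed switch for WAL compression; tag compression has no effect without it -->
-<property>
- <name>hbase.regionserver.wal.enablecompression</name>
- <value>true</value>
-</property>
-<property>
- <name>hbase.regionserver.wal.tags.enablecompression</name>
- <value>true</value>
-</property>
- ]]></programlisting>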
- </section>
- </section>
-
- <section xml:id="hbase.accesscontrol.configuration">
- <title>Access Control Lists (ACLs)</title>
- <section>
- <title>How It Works</title>
- <para>ACLs in HBase are based upon a user's membership in or exclusion from groups, and a
- given group's permissions to access a given resource. ACLs are implemented as a
- coprocessor called AccessController.</para>
- <para>HBase does not maintain a private group mapping, but relies on a <firstterm>Hadoop
- group mapper</firstterm>, which maps between entities in a directory such as LDAP or
- Active Directory, and HBase users. Any supported Hadoop group mapper will work. Users are
- then granted specific permissions (Read, Write, Execute, Create, Admin) against resources
- (global, namespaces, tables, cells, or endpoints).</para>
- <note>
- <para> With Kerberos and Access Control enabled, client access to HBase is authenticated
- and user data is private unless access has been explicitly granted.</para>
- </note>
- <para>HBase has a simpler security model than relational databases, especially in terms of
- client operations. No distinction is made between an insert (new record) and update (of
- existing record), for example, as both collapse down into a Put.</para>
- <section>
- <title>Understanding Access Levels</title>
- <para>HBase access levels are granted independently of each other and allow for different
- types of operations at a given scope.</para>
- <itemizedlist>
- <listitem>
- <para><command>Read (R)</command> - can read data at the given scope</para>
- </listitem>
- <listitem>
- <para><command>Write (W)</command> - can write data at the given scope</para>
- </listitem>
- <listitem>
- <para><command>Execute (X)</command> - can execute coprocessor endpoints at the given
- scope</para>
- </listitem>
- <listitem>
- <para><command>Create (C)</command> - can create tables or drop tables (even those
- they did not create) at the given scope</para>
- </listitem>
- <listitem>
- <para><command>Admin (A)</command> - can perform cluster operations such as balancing
- the cluster or assigning regions at the given scope</para>
- </listitem>
- </itemizedlist>
- <para>The possible scopes are:</para>
- <itemizedlist>
- <listitem>
- <para><command>Superuser</command> - superusers can perform any operation available in
- HBase, to any resource. The user who runs HBase on your cluster is a superuser, as
- are any principals assigned to the configuration property
- <code>hbase.superuser</code> in <filename>hbase-site.xml</filename> on the
- HMaster.</para>
- </listitem>
- <listitem>
- <para><command>Global</command> - permissions granted at <filename>global</filename>
- scope allow the admin to operate on all tables of the cluster.</para>
- </listitem>
- <listitem>
- <para><command>Namespace</command> - permissions granted at
- <filename>namespace</filename> scope apply to all tables within a given
- namespace.</para>
- </listitem>
- <listitem>
- <para><command>Table</command> - permissions granted at <filename>table</filename>
- scope apply to data or metadata within a given table.</para>
- </listitem>
- <listitem>
- <para><command>ColumnFamily</command> - permissions granted at
- <filename>ColumnFamily</filename> scope apply to cells within that
- ColumnFamily.</para>
- </listitem>
- <listitem>
- <para><command>Cell</command> - permissions granted at <filename>cell</filename> scope
- apply to that exact cell coordinate (key, value, timestamp). This allows for policy
- evolution along with data.</para>
- <para>To change an ACL on a specific cell, write an updated cell with new ACL to the
- precise coordinates of the original.</para>
- <para>If you have a multi-versioned schema and want to update ACLs on all visible
- versions, you need to write new cells for all visible versions. The application
- has complete control over policy evolution.</para>
- <para>The exception to the above rule is <code>append</code> and
- <code>increment</code> processing. Appends and increments can carry an ACL in the
- operation. If one is included in the operation, then it will be applied to the
- result of the <code>append</code> or <code>increment</code>. Otherwise, the ACL of
- the existing cell you are appending to or incrementing is preserved.</para>
- </listitem>
- </itemizedlist>
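- <para>As a sketch of the superuser setting described in the list above, the following
- entry names two additional superusers. The names <code>admin1</code> and
- <code>admin2</code> are hypothetical, and the comma-separated list format is an
- assumption to check against the configuration reference for your release.</para>
- <programlisting language="xml"><![CDATA[
-<!-- Hypothetical principals: replace with the accounts you trust as superusers -->
-<property>
- <name>hbase.superuser</name>
- <value>admin1,admin2</value>
-</property>
- ]]></programlisting>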
- <para>The combination of access levels and scopes creates a matrix of possible access
- levels that can be granted to a user. In a production environment, it is useful to think
- of access levels in terms of what is needed to do a specific job. The following list
- describes appropriate access levels for some common types of HBase users. It is
- important not to grant more access than is required for a given user to perform their
- required tasks.</para>
- <itemizedlist>
- <listitem>
- <para><command>Superusers</command> - In a production system, only the HBase user should have superuser
- access. In a development environment, an administrator may need superuser access in
- order to quickly control and manage the cluster. However, this type of administrator
- should usually be a Global Admin rather than a superuser.</para>
- </listitem>
- <listitem>
- <para><command>Global Admins</command> - A global admin can perform tasks and access every table in
- HBase. In a typical production environment, an admin should not have Read or Write
- permissions to data within tables.</para>
- <itemizedlist>
- <listitem>
- <para>A global admin with Admin permissions can perform cluster-wide operations,
- such as balancing, assigning or unassigning regions, or calling an
- explicit major compaction. This is an operations role.</para>
- </listitem>
- <listitem>
- <para>A global admin with Create permissions can create or drop any table within
- HBase. This is more of a DBA-type role.</para>
- </listitem>
- </itemizedlist>
- <para>In a production environment, it is likely that different users will have only
- one of Admin and Create permissions.</para>
- <warning>
- <para>In the current implementation, a Global Admin with <code>Admin</code>
- permission can grant themselves <code>Read</code> and <code>Write</code> permissions
- on a table and gain access to that table's data. For this reason, only grant
- <code>Global Admin</code> permissions to trusted users who actually need
- them.</para>
- <para>Also be aware that a <code>Global Admin</code> with <code>Create</code>
- permission can perform a <code>Put</code> operation on the ACL table, simulating a
- <code>grant</code> or <code>revoke</code> and circumventing the authorization
- check for <code>Global Admin</code> permissions.</para>
- <para>Due to these issues, be cautious with granting <code>Global Admin</code>
- privileges.</para>
- </warning>
- </listitem>
- <listitem>
- <para><command>Namespace Admins</command> - a namespace admin with <code>Create</code>
- permissions can create or drop tables within that namespace, and take and restore
- snapshots. A namespace admin with <code>Admin</code> permissions can perform
- operations such as splits or major compactions on tables within that
- namespace.</para>
- </listitem>
- <listitem>
- <para><command>Table Admins</command> - A table admin can perform administrative
- operations only on that table. A table admin with <code>Create</code> permissions
- can create snapshots from that table or restore that table from a snapshot. A table
- admin with <code>Admin</code> permissions can perform operations such as splits or
- major compactions on that table.</para>
- </listitem>
- <listitem>
- <para><command>Users</command> - Users can read or write data, or both. Users can also
- execute coprocessor endpoints, if given <code>Executable</code> permissions.</para>
- </listitem>
- </itemizedlist>
- <table>
- <title>Real-World Example of Access Levels</title>
- <tgroup cols="4">
- <thead>
- <row>
- <entry>Job Title</entry>
- <entry>Scope</entry>
- <entry>Permissions</entry>
- <entry>Description</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry><para>Senior Administrator</para></entry>
- <entry><para>Global</para></entry>
- <entry><para>Admin, Create</para></entry>
- <entry><para>Manages the cluster and gives access to Junior
- Administrators.</para></entry>
- </row>
- <row>
- <entry><para>Junior Administrator</para></entry>
- <entry><para>Global</para></entry>
- <entry><para>Create</para></entry>
- <entry><para>Creates tables and gives access to Table
- Administrators.</para></entry>
- </row>
- <row>
- <entry><para>Table Administrator</para></entry>
- <entry><para>Table</para></entry>
- <entry><para>Admin</para></entry>
- <entry><para>Maintains a table from an operations point of view.</para></entry>
- </row>
- <row>
- <entry><para>Data Analyst</para></entry>
- <entry><para>Table</para></entry>
- <entry><para>Read</para></entry>
- <entry><para>Creates reports from HBase data.</para></entry>
- </row>
- <row>
- <entry><para>Web Application</para></entry>
- <entry><para>Table</para></entry>
- <entry><para>Read, Write</para></entry>
- <entry><para>Puts data into HBase and uses HBase data to perform
- operations.</para></entry>
- </row>
- </tbody>
- </tgroup>
- <caption><para>This table shows how real-world titles might map to HBase permissions in
- a hypothetical company.</para></caption>
-
- </table>
- <formalpara>
- <title>ACL Matrix</title>
- <para>For more details on how ACLs map to specific HBase operations and tasks, see <xref
- linkend="appendix_acl_matrix"/>.</para>
- </formalpara>
- </section>
- <section>
- <title>Implementation Details</title>
- <para>Cell-level ACLs are implemented using tags (see <xref linkend="hbase.tags"/>). In
- order to use cell-level ACLs, you must be using HFile v3 and HBase 0.98 or newer.</para>
- <orderedlist>
- <title>ACL Implementation Caveats</title>
- <listitem>
- <para>Files created by HBase are owned by the operating system user running the HBase
- process. To interact with HBase files, you should use the API or bulk load
- facility.</para>
- </listitem>
- <listitem>
- <para>HBase does not model "roles" internally. Instead, group names can be
- granted permissions. This allows external modeling of roles via group membership.
- Groups are created and manipulated externally to HBase, via the Hadoop group mapping
- service.</para>
- </listitem>
- </orderedlist>
- </section>
- <section>
- <title>Server-Side Configuration</title>
- <procedure>
- <step>
- <para>As a prerequisite, perform the steps in <xref
- linkend="security.data.basic.server.side"/>.</para>
- </step>
- <step>
- <para>Install and configure the AccessController coprocessor, by setting the following
- properties in <filename>hbase-site.xml</filename>. These properties take a list of
- classes. </para>
- <note>
- <para>If you use the AccessController along with the VisibilityController, the
- AccessController must come first in the list, because with both components active,
- the VisibilityController will delegate access control on its system tables to the
- AccessController. For an example of using both together, see <xref
- linkend="security.example.config"/>.</para>
- </note>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.coprocessor.region.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.token.TokenProvider</value>
-</property>
-<property>
- <name>hbase.coprocessor.master.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController</value>
-</property>
-<property>
- <name>hbase.coprocessor.regionserver.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController</value>
-</property>
-<property>
- <name>hbase.security.exec.permission.checks</name>
- <value>true</value>
-</property>
- ]]></programlisting>
- <para>Optionally, you can enable transport security, by setting
- <option>hbase.rpc.protection</option> to <literal>privacy</literal>. This
- requires HBase 0.98.4 or newer.</para>
- </step>
- <step>
- <para>Set up the Hadoop group mapper in the Hadoop namenode's
- <filename>core-site.xml</filename>. This is a Hadoop file, not an HBase file.
- Customize it to your site's needs. Following is an example.</para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hadoop.security.group.mapping</name>
- <value>org.apache.hadoop.security.LdapGroupsMapping</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.url</name>
- <value>ldap://server</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.bind.user</name>
- <value>Administrator@example-ad.local</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.bind.password</name>
- <value>****</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.base</name>
- <value>dc=example-ad,dc=local</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
- <value>(&(objectClass=user)(sAMAccountName={0}))</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
- <value>(objectClass=group)</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
- <value>member</value>
-</property>
-
-<property>
- <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
- <value>cn</value>
-</property>]]>
- </programlisting>
- </step>
- <step>
- <para>Optionally, enable the early-out evaluation strategy. Prior to HBase 0.98.0, if
- a user was not granted access to a column family, or at least a column qualifier, an
- AccessDeniedException would be thrown. HBase 0.98.0 removed this exception in order
- to allow cell-level exceptional grants. To restore the old behavior in HBase
- 0.98.0-0.98.6, set <option>hbase.security.access.early_out</option> to
- <literal>true</literal> in <filename>hbase-site.xml</filename>. In HBase 0.98.6,
- the default has been returned to <literal>true</literal>.</para>
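- <para>As an illustration, the early-out setting described in this step might look
- like the following fragment in <filename>hbase-site.xml</filename>. This is a minimal
- sketch that uses only the property name and value given above; merge it into your
- site's existing configuration rather than copying it verbatim.</para>
- <programlisting language="xml"><![CDATA[
-<!-- Sketch: restore the early-out evaluation strategy for ACL checks. -->
-<property>
- <name>hbase.security.access.early_out</name>
- <value>true</value>
-</property>
- ]]></programlisting>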
- </step>
- <step>
- <para>Distribute your configuration and restart your cluster for changes to take
- effect.</para>
- </step>
- <step>
- <para>To test your configuration, log into HBase Shell as a given user and use the
- <command>whoami</command> command to report the groups your user is part of. In
- this example, the user is reported as being a member of the <code>services</code>
- group.</para>
- <screen>
-hbase> <userinput>whoami</userinput>
-<computeroutput>service (auth:KERBEROS)
- groups: services</computeroutput>
- </screen>
- </step>
- </procedure>
- </section>
- <section>
- <title>Administration</title>
- <para>Administration tasks can be performed from HBase Shell or via an API.</para>
- <caution>
- <title>API Examples</title>
- <para>Many of the API examples below are taken from source files
- <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java</filename>
- and
- <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/SecureTestUtil.java</filename>.</para>
- <para>Neither the examples, nor the source files they are taken from, are part of the
- public HBase API, and are provided for illustration only. Refer to the official API
- for usage instructions.</para>
- </caution>
- <procedure>
- <step>
- <title>User and Group Administration</title>
- <para>Users and groups are maintained external to HBase, in your directory.</para>
- </step>
- <step>
- <title>Granting Access To A Namespace, Table, Column Family, or Cell</title>
- <para>There are a few different types of syntax for grant statements. The first, and
- most familiar, is as follows, with the table, column family, and column qualifier
- being optional:</para>
- <screen>grant 'user', 'RWXCA', 'TABLE', 'CF', 'CQ'</screen>
- <para>Groups and users are granted access in the same way, but groups are prefixed
- with an <literal>@</literal> symbol. Likewise, tables and namespaces are
- specified in the same way, but namespaces are prefixed with an <literal>@</literal>
- symbol.</para>
- <para>It is also possible to grant multiple permissions against the same resource in a
- single statement. In this form, the first sub-clause maps users to ACLs and
- the second sub-clause specifies the resource, as in the cell-level example later in
- this step.</para>
- <note>
- <para>HBase Shell support for granting and revoking access at the cell level is for
- testing and verification support, and should not be employed for production use
- because it won't apply the permissions to cells that don't exist yet. The correct
- way to apply cell level permissions is to do so in the application code when
- storing the values.</para>
- </note>
- <formalpara>
- <title>ACL Granularity and Evaluation Order</title>
- <para>ACLs are evaluated from least granular to most granular, and when an ACL is
- reached that grants permission, evaluation stops. This means that a cell ACL cannot
- override access that was already granted at a coarser granularity.</para>
- </formalpara>
- <example>
- <title>HBase Shell</title>
- <itemizedlist>
- <listitem>
- <para>Global:</para>
- <screen>hbase> <userinput>grant '@admins', 'RWXCA'</userinput></screen>
- </listitem>
- <listitem>
- <para>Namespace:</para>
- <screen>hbase> <userinput>grant 'service', 'RWXCA', '@test-NS'</userinput></screen>
- </listitem>
- <listitem>
- <para>Table:</para>
- <screen>hbase> <userinput>grant 'service', 'RWXCA', 'user'</userinput></screen>
- </listitem>
- <listitem>
- <para>Column Family:</para>
- <screen>hbase> <userinput>grant '@developers', 'RW', 'user', 'i'</userinput></screen>
- </listitem>
- <listitem>
- <para>Column Qualifier:</para>
- <screen>hbase> <userinput>grant 'service', 'RW', 'user', 'i', 'foo'</userinput></screen>
- </listitem>
- <listitem>
- <para>Cell:</para>
- <para>Granting cell ACLs uses the following syntax:</para>
- <screen>grant <replaceable>&lt;table&gt;</replaceable>, \
- { '<replaceable>&lt;user-or-group&gt;</replaceable>' => \
- '<replaceable>&lt;permissions&gt;</replaceable>', ... }, \
- { <replaceable>&lt;scanner-specification&gt;</replaceable> }</screen>
- <itemizedlist>
- <listitem>
- <para><replaceable>&lt;user-or-group&gt;</replaceable> is the user or group
- name, prefixed with <literal>@</literal> in the case of a group.</para>
- </listitem>
- <listitem>
- <para><replaceable>&lt;permissions&gt;</replaceable> is a string containing
- any or all of "RWXCA", though only R and W are meaningful at cell
- scope.</para>
- </listitem>
- <listitem>
- <para><replaceable>&lt;scanner-specification&gt;</replaceable> is the
- scanner specification syntax and conventions used by the 'scan' shell
- command. For some examples of scanner specifications, issue the following
- HBase Shell command.</para>
- <screen>hbase> help "scan"</screen>
- </listitem>
- </itemizedlist>
- <para>This example grants read access to the 'testuser' user and read/write
- access to the 'developers' group, on cells in the 'pii' column which match the
- filter.</para>
- <screen>hbase> grant 'user', \
- { '@developers' => 'RW', 'testuser' => 'R' }, \
- { COLUMNS => 'pii', FILTER => "(PrefixFilter ('test'))" }</screen>
- <para>The shell will run a scanner with the given criteria, rewrite the found
- cells with new ACLs, and store them back to their exact coordinates.</para>
- </listitem>
- </itemizedlist>
- </example>
- <example>
- <title>API</title>
- <para>The following example shows how to grant access at the table level.</para>
- <programlisting language="java"><![CDATA[
-public static void grantOnTable(final HBaseTestingUtility util, final String user,
- final TableName table, final byte[] family, final byte[] qualifier,
- final Permission.Action... actions) throws Exception {
- SecureTestUtil.updateACLs(util, new Callable<Void>() {
- @Override
- public Void call() throws Exception {
- HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME);
- try {
- BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
- AccessControlService.BlockingInterface protocol =
- AccessControlService.newBlockingStub(service);
- ProtobufUtil.grant(protocol, user, table, family, qualifier, actions);
- } finally {
- acl.close();
- }
- return null;
- }
- });
-} ]]>
- </programlisting>
- <para>To grant permissions at the cell level, you can use the
- <code>Mutation.setACL</code> method:</para>
- <programlisting language="java"><![CDATA[
-Mutation.setACL(String user, Permission perms)
-Mutation.setACL(Map<String, Permission> perms)
- ]]>
- </programlisting>
- <para>Specifically, this example provides read permission to a user called
- <literal>user1</literal> on any cells contained in a particular Put
- operation:</para>
- <programlisting language="java"><![CDATA[
-put.setACL("user1", new Permission(Permission.Action.READ));
- ]]></programlisting>
- </example>
- </step>
- <step>
- <title>Revoking Access Control From a Namespace, Table, Column Family, or Cell</title>
- <para>The <command>revoke</command> command and API are twins of the grant command and
- API, and the syntax is exactly the same. The only exception is that you cannot
- revoke permissions at the cell level. You can only revoke access that has previously
- been granted, and a <command>revoke</command> statement is not the same thing as
- explicit denial to a resource.</para>
- <note>
- <para>HBase Shell support for granting and revoking access is for testing and
- verification support, and should not be employed for production use because it
- won't apply the permissions to cells that don't exist yet. The correct way to
- apply cell-level permissions is to do so in the application code when storing the
- values.</para>
- </note>
- <example>
- <title>Revoking Access To a Table</title>
- <programlisting language="java">
-<![CDATA[public static void revokeFromTable(final HBaseTestingUtility util, final String user,
- final TableName table, final byte[] family, final byte[] qualifier,
- final Permission.Action... actions) throws Exception {
- SecureTestUtil.updateACLs(util, new Callable<Void>() {
- @Override
- public Void call() throws Exception {
- HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME);
- try {
- BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
- AccessControlService.BlockingInterface protocol =
- AccessControlService.newBlockingStub(service);
- ProtobufUtil.revoke(protocol, user, table, family, qualifier, actions);
- } finally {
- acl.close();
- }
- return null;
- }
- });
-} ]]>
- </programlisting>
- </example>
- </step>
- <step>
- <title>Showing a User's Effective Permissions</title>
- <example>
- <title>HBase Shell</title>
- <screen>hbase> user_permission 'user'</screen>
- <screen>hbase> user_permission '.*'</screen>
- <screen>hbase> user_permission <replaceable>JAVA_REGEX</replaceable></screen>
- </example>
- <example>
- <title>API</title>
- <programlisting language="java"><![CDATA[
-public static void verifyAllowed(User user, AccessTestAction action, int count) throws Exception {
- try {
- Object obj = user.runAs(action);
- if (obj != null && obj instanceof List<?>) {
- List<?> results = (List<?>) obj;
- if (results != null && results.isEmpty()) {
- fail("Empty non null results from action for user '" + user.getShortName() + "'");
- }
- assertEquals(count, results.size());
- }
- } catch (AccessDeniedException ade) {
- fail("Expected action to pass for user '" + user.getShortName() + "' but was denied");
- }
-}]]>
- </programlisting>
- </example>
- </step>
- </procedure>
- </section>
- </section>
- </section>
-
- <section>
- <title>Visibility Labels</title>
- <para>Visibility label control can be used to permit only users or principals associated with
- a given label to read or access cells with that label. For instance, you might label a cell
- <literal>top-secret</literal>, and only grant access to that label to the
- <literal>managers</literal> group. Visibility labels are implemented using Tags, which are
- a feature of HFile v3, and allow you to store metadata on a per-cell basis. A label is a
- string, and labels can be combined into expressions by using logical operators (&amp;, |, or
- !), and using parentheses for grouping. HBase does not do any kind of validation of
- expressions beyond basic well-formedness. Visibility labels have no meaning on their own,
- and may be used to denote sensitivity level, privilege level, or any other arbitrary
- semantic meaning.</para>
- <para>If a user's labels do not match a cell's label or expression, the user is
- denied access to the cell.</para>
- <para>In HBase 0.98.6 and newer, UTF-8 encoding is supported for visibility labels and
- expressions. When creating labels using the <code>addLabels(conf, labels)</code> method
- provided by the <code>org.apache.hadoop.hbase.security.visibility.VisibilityClient</code>
- class and passing labels in Authorizations via Scan or Get, labels can contain UTF-8
- characters, as well as the logical operators normally used in visibility labels, with normal
- Java notations, without needing any escaping method. However, when you pass a CellVisibility
- expression via a Mutation, you must enclose the expression with the
- <code>CellVisibility.quote()</code> method if you use UTF-8 characters or logical
- operators. See <code>TestExpressionParser</code> and the source file
- <filename>hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestScan.java</filename>.
- </para>
- <para>A user adds visibility expressions to a cell during a Put operation. In the default
- configuration, the user does not need access to a label in order to label cells with it.
- This behavior is controlled by the configuration option
- <option>hbase.security.visibility.mutations.checkauths</option>. If you set this option to
- <literal>true</literal>, the labels the user is modifying as part of the mutation must be
- associated with the user, or the mutation will fail. Whether a user is authorized to read a
- labelled cell is determined during a Get or Scan, and results which the user is not allowed
- to read are filtered out. This incurs the same I/O penalty as if the results were returned,
- but reduces load on the network.</para>
- <para>Visibility labels can also be specified during Delete operations. For details about
- visibility labels and Deletes, see <link
- xlink:href="https://issues.apache.org/jira/browse/HBASE-10885">HBASE-10885</link>. </para>
- <para>The user's effective label set is built in the RPC context when a request is first
- received by the RegionServer. The way that users are associated with labels is pluggable.
- The default plugin passes through labels specified in Authorizations added to the Get or
- Scan and checks those against the calling user's authenticated labels list. When the client
- passes labels for which the user is not authenticated, the default plugin drops them. You
- can pass a subset of user authenticated labels via the
- <code>Get#setAuthorizations(Authorizations(String,...))</code> and
- <code>Scan#setAuthorizations(Authorizations(String,...))</code> methods.</para>
- <para>Visibility label access checking is performed by the VisibilityController coprocessor.
- You can use interface <code>VisibilityLabelService</code> to provide a custom implementation
- and/or control the way that visibility labels are stored with cells. See the source file
- <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithCustomVisLabService.java</filename>
- for one example.</para>
-
- <para>Visibility labels can be used in conjunction with ACLs.</para>
- <table>
- <title>Examples of Visibility Expressions</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry>Expression</entry>
- <entry>Interpretation</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry><screen>fulltime</screen></entry>
- <entry><para>Allow access to users associated with the
- <code>fulltime</code> label.</para></entry>
- </row>
- <row>
- <entry><screen>!public</screen></entry>
- <entry><para>Allow access to users not associated with the
- <code>public</code> label.</para></entry>
- </row>
- <row>
- <entry><screen>( secret | topsecret ) &amp; !probationary</screen></entry>
- <entry><para>Allow access to users associated with either the
- <code>secret</code> or <code>topsecret</code> label and not
- associated with the <code>probationary</code> label.</para></entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <section>
- <title>Server-Side Configuration</title>
- <procedure>
- <step>
- <para>As a prerequisite, perform the steps in <xref
- linkend="security.data.basic.server.side"/>.</para></step>
- <step>
- <para>Install and configure the VisibilityController coprocessor by setting the
- following properties in <filename>hbase-site.xml</filename>. These properties take a
- list of class names.</para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.coprocessor.region.classes</name>
- <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
-</property>
-<property>
- <name>hbase.coprocessor.master.classes</name>
- <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
-</property>
- ]]></programlisting>
- <note>
- <para>If you use the AccessController and VisibilityController coprocessors together,
- the AccessController must come first in the list, because with both components
- active, the VisibilityController will delegate access control on its system tables
- to the AccessController.</para>
- </note>
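- <para>As a sketch of the ordering described in the note above, coprocessor lists that
- load both components might look like the following. Only the two class names come from
- this chapter; any other coprocessors you load are site-specific and would be added to
- the same comma-separated lists.</para>
- <programlisting language="xml"><![CDATA[
-<!-- Sketch: AccessController is listed before VisibilityController. -->
-<property>
- <name>hbase.coprocessor.region.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
-</property>
-<property>
- <name>hbase.coprocessor.master.classes</name>
- <value>org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
-</property>
- ]]></programlisting>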
- </step>
- <step>
- <title>Adjust Configuration</title>
- <para>By default, users can label cells with any label, including labels they are not
- associated with, which means that a user can Put data that they cannot read. For
- example, a user could label a cell with the (hypothetical) 'topsecret' label even if
- the user is not associated with that label. If you only want users to be able to label
- cells with labels they are associated with, set
- <property>hbase.security.visibility.mutations.checkauths</property> to
- <literal>true</literal>. In that case, the mutation will fail if it makes use of
- labels the user is not associated with.</para>
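- <para>For illustration, the stricter behavior described in this step might be enabled
- with a fragment like the following in <filename>hbase-site.xml</filename>. This is a
- minimal sketch that uses only the property name and value given above.</para>
- <programlisting language="xml"><![CDATA[
-<!-- Sketch: reject mutations that use labels the writing user is not associated with. -->
-<property>
- <name>hbase.security.visibility.mutations.checkauths</name>
- <value>true</value>
-</property>
- ]]></programlisting>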
- </step>
- <step>
- <para>Distribute your configuration and restart your cluster for changes to take
- effect.</para>
- </step>
- </procedure>
- </section>
- <section>
- <title>Administration</title>
- <para>Administration tasks can be performed using the HBase Shell or the Java API. For
- defining the list of visibility labels and associating labels with users, the
- HBase Shell is probably simpler.</para>
- <caution>
- <title>API Examples</title>
- <para>Many of the Java API examples in this section are taken from the source file
- <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java</filename>.
- Refer to that file or the API documentation for more context.</para>
- <para>Neither these examples, nor the source file they were taken from, are part of the
- public HBase API, and are provided for illustration only. Refer to the official API
- for usage instructions.</para>
- </caution>
- <procedure>
- <step>
- <title>Define the List of Visibility Labels</title>
- <example>
- <title>HBase Shell</title>
- <screen>hbase> <userinput>add_labels [ 'admin', 'service', 'developer', 'test' ]</userinput></screen>
- </example>
- <example>
- <title>Java API</title>
- <programlisting language="java"><![CDATA[
-public static void addLabels() throws Exception {
- PrivilegedExceptionAction<VisibilityLabelsResponse> action =
- new PrivilegedExceptionAction<VisibilityLabelsResponse>() {
- public VisibilityLabelsResponse run() throws Exception {
- String[] labels = { SECRET, TOPSECRET, CONFIDENTIAL, PUBLIC, PRIVATE, COPYRIGHT, ACCENT,
- UNICODE_VIS_TAG, UC1, UC2 };
- try {
- VisibilityClient.addLabels(conf, labels);
- } catch (Throwable t) {
- throw new IOException(t);
- }
- return null;
- }
- };
- SUPERUSER.runAs(action);
-}
- ]]></programlisting>
- </example>
- </step>
- <step>
- <title>Associate Labels with Users</title>
- <example>
- <title>HBase Shell</title>
- <screen>hbase> <userinput>set_auths 'service', [ 'service' ]</userinput></screen>
- <screen>hbase> <userinput>set_auths 'testuser', [ 'test' ]</userinput></screen>
- <screen>hbase> <userinput>set_auths 'qa', [ 'test', 'developer' ]</userinput></screen>
- </example>
- <example>
- <title>Java API</title>
- <programlisting language="java"><![CDATA[
-public void testSetAndGetUserAuths() throws Throwable {
- final String user = "user1";
- PrivilegedExceptionAction<Void> action = new PrivilegedExceptionAction<Void>() {
- public Void run() throws Exception {
- String[] auths = { SECRET, CONFIDENTIAL };
- try {
- VisibilityClient.setAuths(conf, auths, user);
- } catch (Throwable e) {
- }
- return null;
- }
- ...
- ]]></programlisting>
- </example>
- </step>
- <step>
- <title>Clear Labels From Users</title>
- <example>
- <title>HBase Shell</title>
- <screen>hbase> <userinput>clear_auths 'service', [ 'service' ]</userinput></screen>
- <screen>hbase> <userinput>clear_auths 'testuser', [ 'test' ]</userinput></screen>
- <screen>hbase> <userinput>clear_auths 'qa', [ 'test', 'developer' ]</userinput></screen>
- </example>
- <example>
- <title>Java API</title>
- <programlisting language="java"><![CDATA[
-...
-auths = new String[] { SECRET, PUBLIC, CONFIDENTIAL };
-VisibilityLabelsResponse response = null;
-try {
- response = VisibilityClient.clearAuths(conf, auths, user);
-} catch (Throwable e) {
- fail("Should not have failed");
-...
- ]]></programlisting>
- </example>
- </step>
- <step>
- <title>Apply a Label or Expression to a Cell</title>
- <para>The label is only applied when data is written. The label is associated with a
- given version of the cell.</para>
- <example>
- <title>HBase Shell</title>
- <screen>hbase> <userinput>set_visibility 'user', 'admin|service|developer', \
- { COLUMNS => 'i' }</userinput></screen>
- <screen>hbase> <userinput>set_visibility 'user', 'admin|service', \
- { COLUMNS => 'pii' }</userinput></screen>
- <screen>hbase> <userinput>set_visibility 'user', 'test', \
- { COLUMNS => [ 'i', 'pii' ], FILTER => "(PrefixFilter ('test'))" }</userinput></screen>
- </example>
- <note>
- <para>HBase Shell support for applying labels or permissions to cells is for testing
- and verification support, and should not be employed for production use because it
- won't apply the labels to cells that don't exist yet. The correct way to apply cell
- level labels is to do so in the application code when storing the values.</para>
- </note>
- <example>
- <title>Java API</title>
- <programlisting language="java"><![CDATA[
-static HTable createTableAndWriteDataWithLabels(TableName tableName, String... labelExps)
- throws Exception {
- HTable table = null;
- try {
- table = TEST_UTIL.createTable(tableName, fam);
- int i = 1;
- List<Put> puts = new ArrayList<Put>();
- for (String labelExp : labelExps) {
- Put put = new Put(Bytes.toBytes("row" + i));
- put.add(fam, qual, HConstants.LATEST_TIMESTAMP, value);
- put.setCellVisibility(new CellVisibility(labelExp));
- puts.add(put);
- i++;
- }
- table.put(puts);
- } finally {
- if (table != null) {
- table.flushCommits();
- }
- }
- return table;
-}
- ]]></programlisting>
- </example>
- </step>
- </procedure>
- </section>
- <section>
- <title>Implementing Your Own Visibility Label Algorithm</title>
- <para>Interpreting the labels authenticated for a given get/scan request is a pluggable
- algorithm. You can specify a custom plugin by using the property
- <code>hbase.regionserver.scan.visibility.label.generator.class</code>. The default
- implementation class is
- <code>org.apache.hadoop.hbase.security.visibility.DefaultScanLabelGenerator</code>. You
- can also configure a set of <code>ScanLabelGenerators</code> to be used by the system, as
- a comma-separated list.</para>
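- <para>As a sketch, the property described above might be set as follows in
- <filename>hbase-site.xml</filename>. The value shown is simply the default
- implementation class named in this section; a custom plugin, or a comma-separated list
- of generator classes, would take its place.</para>
- <programlisting language="xml"><![CDATA[
-<!-- Sketch: explicitly select the default scan label generator. -->
-<property>
- <name>hbase.regionserver.scan.visibility.label.generator.class</name>
- <value>org.apache.hadoop.hbase.security.visibility.DefaultScanLabelGenerator</value>
-</property>
- ]]></programlisting>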
- </section>
- </section>
-
- <section xml:id="hbase.encryption.server">
- <title>Transparent Encryption of Data At Rest</title>
- <para>HBase provides a mechanism for protecting your data at rest, in HFiles and the WAL, which
- reside within HDFS or another distributed filesystem. A two-tier architecture is used for
- flexible and non-intrusive key rotation. "Transparent" means that no implementation changes
- are needed on the client side. When data is written, it is encrypted. When it is read, it is
- decrypted on demand.</para>
- <section>
- <title>How It Works</title>
- <para>The administrator provisions a master key for the cluster, which is stored in a key
- provider accessible to every trusted HBase process, including the HMaster, RegionServers,
- and clients (such as HBase Shell) on administrative workstations. The default key provider
- is integrated with the Java KeyStore API and any key management systems with support for
- it. Other custom key provider implementations are possible. The key retrieval mechanism is
- configured in the <filename>hbase-site.xml</filename> configuration file. The master key
- may be stored on the cluster servers, protected by a secure KeyStore file, or on an
- external keyserver, or in a hardware security module. This master key is resolved as
- needed by HBase processes through the configured key provider.</para>
- <para>Next, encryption use can be specified in the schema, per column family, by creating
- or modifying a column descriptor to include two additional attributes: the name of the
- encryption algorithm to use (currently only "AES" is supported), and optionally, a data
- key wrapped (encrypted) with the cluster master key. If a data key is not explicitly
- configured for a ColumnFamily, HBase will create a random data key per HFile. This
- provides an incremental improvement in security over sharing a single configured data
- key across every HFile in the ColumnFamily. Unless you need to
- supply an explicit data key, such as in a case where you are generating encrypted HFiles
- for bulk import with a given data key, only specify the encryption algorithm in the
- ColumnFamily schema metadata and let HBase create data keys on demand. Per Column Family
- keys facilitate low impact incremental key rotation and reduce the scope of any external
- leak of key material. The wrapped data key is stored in the ColumnFamily schema metadata,
- and in each HFile for the Column Family, encrypted with the cluster master key. After the
- Column Family is configured for encryption, any new HFiles will be written encrypted. To
- ensure encryption of all HFiles, trigger a major compaction after enabling this
- feature.</para>
- <para>When the HFile is opened, the data key is extracted from the HFile, decrypted with the
- cluster master key, and used for decryption of the remainder of the HFile. The HFile will
- be unreadable if the master key is not available. If a remote user somehow acquires access
- to the HFile data because of some lapse in HDFS permissions, or from inappropriately
- discarded media, it will not be possible to decrypt either the data key or the file
- data.</para>
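- <para>The two-tier scheme described above can be sketched with the JDK alone. This is an
- illustration under stated assumptions, not HBase's implementation: the class
- <code>TwoTierKeySketch</code> and its methods are hypothetical, and the JDK's
- <code>AESWrap</code> cipher stands in for whatever wrapping format HBase uses internally. A
- random data key encrypts the payload, and only the wrapped form of that key is kept next
- to the ciphertext.</para>

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Hypothetical JDK-only sketch of a master key wrapping a per-file data key. */
public class TwoTierKeySketch {

  /** Generates a random 128-bit AES key. */
  static SecretKey newAesKey() throws GeneralSecurityException {
    KeyGenerator gen = KeyGenerator.getInstance("AES");
    gen.init(128);
    return gen.generateKey();
  }

  /** Wraps (encrypts) the data key with the master key. */
  static byte[] wrap(SecretKey masterKey, SecretKey dataKey) throws GeneralSecurityException {
    Cipher c = Cipher.getInstance("AESWrap");
    c.init(Cipher.WRAP_MODE, masterKey);
    return c.wrap(dataKey);
  }

  /** Recovers the data key; fails if the wrong master key is supplied. */
  static SecretKey unwrap(SecretKey masterKey, byte[] wrapped) throws GeneralSecurityException {
    Cipher c = Cipher.getInstance("AESWrap");
    c.init(Cipher.UNWRAP_MODE, masterKey);
    return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
  }

  /** Encrypts or decrypts with AES in CTR mode, depending on the Cipher mode constant. */
  static byte[] ctr(int mode, SecretKey key, byte[] iv, byte[] input)
      throws GeneralSecurityException {
    Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
    c.init(mode, key, new IvParameterSpec(iv));
    return c.doFinal(input);
  }

  public static void main(String[] args) throws Exception {
    SecretKey masterKey = newAesKey(); // would come from the key provider
    SecretKey dataKey = newAesKey();   // random, one per file in this sketch

    byte[] iv = new byte[16];
    new SecureRandom().nextBytes(iv);
    byte[] plain = "row1/cf:q=value".getBytes(StandardCharsets.UTF_8);

    // What is stored: the wrapped data key, the IV, and the ciphertext.
    byte[] wrappedKey = wrap(masterKey, dataKey);
    byte[] cipherText = ctr(Cipher.ENCRYPT_MODE, dataKey, iv, plain);

    // Reading back: unwrap the data key with the master key, then decrypt.
    SecretKey recovered = unwrap(masterKey, wrappedKey);
    byte[] decrypted = ctr(Cipher.DECRYPT_MODE, recovered, iv, cipherText);
    System.out.println("round trip ok: " + Arrays.equals(plain, decrypted));
  }
}
```

- <para>In this sketch, a reader who holds the stored bytes but not the master key cannot
- unwrap the data key, which mirrors the property described for HFiles above.</para>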
- <para>It is also possible to encrypt the WAL. Even though WALs are transient, it is
- necessary to encrypt the WALEdits to avoid circumventing HFile protections for encrypted
- column families, in the event that the underlying filesystem is compromised. When WAL
- encryption is enabled, all WALs are encrypted, regardless of whether the relevant HFiles
- are encrypted.</para>
- </section>
- <section>
- <title>Server-Side Configuration</title>
- <para>This procedure assumes you are using the default Java keystore implementation. If you
- are using a custom implementation, check its documentation and adjust accordingly.</para>
- <procedure>
- <step>
- <title>Create a secret key of appropriate length for AES encryption, using the
- <code>keytool</code> utility.</title>
- <screen>$ <userinput>keytool -keystore /path/to/hbase/conf/hbase.jks \
- -storetype jceks -storepass **** \
- -genseckey -keyalg AES -keysize 128 \
- -alias &lt;alias&gt;</userinput></screen>
- <para>Replace <replaceable>****</replaceable> with the password for the keystore file
- and &lt;alias&gt; with the username of the HBase service account, or an arbitrary
- string. If you use an arbitrary string, you will need to configure HBase to use it,
- and that is covered below. Specify a keysize that is appropriate. Do not specify a
- separate password for the key, but press <keycap>Return</keycap> when prompted.</para>
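- <para>For readers who prefer to script this step, the same kind of keystore can be produced
- with the JDK's <code>KeyStore</code> API. The following is a minimal sketch, not part of
- HBase: the class name <code>KeystoreSketch</code>, the temporary file, the password
- <literal>changeit</literal>, and the alias <literal>hbase</literal> are all placeholders
- chosen for illustration. The key entry is protected with the store password, which
- corresponds to pressing <keycap>Return</keycap> at the key password prompt.</para>

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** Hypothetical sketch: create and read back a JCEKS keystore holding one AES key. */
public class KeystoreSketch {

  /** Writes a new JCEKS keystore at path with a fresh 128-bit AES key under alias. */
  static void createKeystore(Path path, char[] password, String alias) throws Exception {
    KeyGenerator gen = KeyGenerator.getInstance("AES");
    gen.init(128);
    SecretKey key = gen.generateKey();

    KeyStore ks = KeyStore.getInstance("JCEKS");
    ks.load(null, password); // initialise an empty keystore
    // The key password equals the store password in this sketch.
    ks.setEntry(alias, new KeyStore.SecretKeyEntry(key),
        new KeyStore.PasswordProtection(password));
    try (OutputStream out = Files.newOutputStream(path)) {
      ks.store(out, password);
    }
  }

  /** Loads the secret key stored under alias. */
  static SecretKey loadKey(Path path, char[] password, String alias) throws Exception {
    KeyStore ks = KeyStore.getInstance("JCEKS");
    try (InputStream in = Files.newInputStream(path)) {
      ks.load(in, password);
    }
    return (SecretKey) ks.getKey(alias, password);
  }

  public static void main(String[] args) throws Exception {
    Path path = Files.createTempFile("hbase-sketch", ".jks"); // placeholder location
    char[] password = "changeit".toCharArray();               // placeholder password
    createKeystore(path, password, "hbase");
    SecretKey key = loadKey(path, password, "hbase");
    System.out.println(key.getAlgorithm() + " key, " + key.getEncoded().length * 8 + " bits");
    Files.delete(path);
  }
}
```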
- </step>
- <step>
- <title>Set appropriate permissions on the keystore file and distribute it to all the HBase
- servers.</title>
- <para>The previous command created a file called <filename>hbase.jks</filename> in the
- HBase <filename>conf/</filename> directory. Set the permissions and ownership on this
- file such that only the HBase service account user can read the file, and securely
- distribute the key to all HBase servers.</para>
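- <para>As one way to express the permission change in code, the sketch below restricts a
- file to owner read-only on a POSIX filesystem using <code>java.nio.file</code>. It is an
- assumption-laden illustration: the class <code>OwnerOnlySketch</code> is hypothetical, a
- temporary file stands in for the keystore, and changing file ownership (which usually
- requires elevated privileges) is not shown.</para>

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Hypothetical sketch: make a file readable by its owner only. */
public class OwnerOnlySketch {

  /**
   * Sets the file mode to owner read-only (r--------).
   * Returns false, and changes nothing, where POSIX permissions are unsupported.
   */
  static boolean restrictToOwnerRead(Path file) throws IOException {
    if (!FileSystems.getDefault().supportedFileAttributeViews().contains("posix")) {
      return false;
    }
    Set<PosixFilePermission> ownerRead = PosixFilePermissions.fromString("r--------");
    Files.setPosixFilePermissions(file, ownerRead);
    return true;
  }

  public static void main(String[] args) throws IOException {
    Path keystore = Files.createTempFile("hbase-sketch", ".jks"); // stand-in for hbase.jks
    try {
      if (restrictToOwnerRead(keystore)) {
        System.out.println(
            PosixFilePermissions.toString(Files.getPosixFilePermissions(keystore)));
      } else {
        System.out.println("POSIX permissions not supported on this filesystem");
      }
    } finally {
      Files.delete(keystore);
    }
  }
}
```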
- </step>
- <step>
- <title>Configure the HBase daemons.</title>
- <para>Set the following properties in <filename>hbase-site.xml</filename> on the region
- servers, to configure HBase daemons to use a key provider backed by the KeyStore file
- for retrieving the cluster master key. In the example below, replace
- <replaceable>****</replaceable> with the password.</para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.crypto.keyprovider</name>
- <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
-</property>
-<property>
- <name>hbase.crypto.keyprovider.parameters</name>
- <value>jceks:///path/to/hbase/conf/hbase.jks?password=****</value>
-</property>
- ]]></programlisting>
- <para>By default, the HBase service account name will be used to resolve the cluster
- master key. However, you can store it with an arbitrary alias (in the
- <command>keytool</command> command). In that case, set the following property to the
- alias you used.</para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.crypto.master.key.name</name>
- <value>my-alias</value>
-</property>]]>
- </programlisting>
- <para>You also need to be sure your HFiles use HFile v3, in order to use transparent
- encryption. This is the default configuration for HBase 1.0 onward. For previous
- versions, set the following property in your <filename>hbase-site.xml</filename>
- file.</para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hfile.format.version</name>
- <value>3</value>
-</property>]]>
- </programlisting>
- <para>Optionally, you can use a different cipher provider, either a Java Cryptography
- Extension (JCE) algorithm provider or a custom HBase cipher implementation. </para>
- <substeps>
- <step>
- <title>JCE: </title>
- <itemizedlist>
- <listitem>
- <para>Install a signed JCE provider (supporting “AES/CTR/NoPadding” mode with
- 128 bit keys) </para>
- </listitem>
- <listitem>
- <para>Add it with highest preference to the JCE site configuration file
- <filename>$JAVA_HOME/lib/security/java.security</filename>.</para>
- </listitem>
- <listitem>
- <para>Update <option>hbase.crypto.algorithm.aes.provider</option> and
- <option>hbase.crypto.algorithm.rng.provider</option> options in
- <filename>hbase-site.xml</filename>. </para>
- </listitem>
- </itemizedlist>
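- <para>Before pointing HBase at a different JCE provider, it can help to confirm which
- provider the JVM actually selects for <literal>AES/CTR/NoPadding</literal> with a 128-bit
- key. The following sketch uses only standard JCE calls; the class
- <code>CtrProviderCheck</code> is hypothetical and is not an HBase tool.</para>

```java
import java.security.Provider;
import java.security.SecureRandom;
import java.security.Security;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: verify AES/CTR/NoPadding support and report the provider used. */
public class CtrProviderCheck {

  /** Runs an encrypt/decrypt round trip and returns the selected provider's name. */
  static String checkDefaultProvider() throws Exception {
    SecureRandom rng = new SecureRandom();
    byte[] key = new byte[16]; // 128-bit key
    byte[] iv = new byte[16];  // AES block-sized IV
    byte[] plain = new byte[64];
    rng.nextBytes(key);
    rng.nextBytes(iv);
    rng.nextBytes(plain);

    Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
    enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    byte[] cipherText = enc.doFinal(plain);

    Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
    dec.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    if (!Arrays.equals(plain, dec.doFinal(cipherText))) {
      throw new IllegalStateException("AES/CTR/NoPadding round trip failed");
    }
    return enc.getProvider().getName();
  }

  public static void main(String[] args) throws Exception {
    // Providers are listed in preference order, as configured in java.security.
    for (Provider p : Security.getProviders()) {
      System.out.println("installed: " + p.getName());
    }
    System.out.println("AES/CTR/NoPadding served by: " + checkDefaultProvider());
  }
}
```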
- </step>
- <step>
- <title>Custom HBase Cipher: </title>
- <itemizedlist>
- <listitem>
- <para>Implement
- <code>org.apache.hadoop.hbase.io.crypto.CipherProvider</code>.</para>
- </listitem>
- <listitem>
- <para>Add the implementation to the server classpath.</para>
- </listitem>
- <listitem>
- <para>Update <option>hbase.crypto.cipherprovider</option> in
- <filename>hbase-site.xml</filename>.</para>
- </listitem>
- </itemizedlist>
- </step>
- </substeps>
- </step>
- <step>
- <title>Configure WAL encryption.</title>
- <para>Configure WAL encryption in every RegionServer's
- <filename>hbase-site.xml</filename>, by setting the following properties. You can
- include these in the HMaster's <filename>hbase-site.xml</filename> as well, but the
- HMaster does not have a WAL and will not use them.</para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.regionserver.hlog.reader.impl</name>
- <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
-</property>
-<property>
- <name>hbase.regionserver.hlog.writer.impl</name>
- <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
-</property>
-<property>
- <name>hbase.regionserver.wal.encryption</name>
- <value>true</value>
-</property>
- ]]></programlisting>
- </step>
- <step>
- <title>Configure permissions on the <filename>hbase-site.xml</filename> file.</title>
- <para>Because the keystore password is stored in <filename>hbase-site.xml</filename>, you need to ensure
- that only the HBase user can read the <filename>hbase-site.xml</filename> file, using
- file ownership and permissions.</para>
- </step>
- <step>
- <title>Restart your cluster.</title>
- <para>Distribute the new configuration file to all nodes and restart your
- cluster.</para>
- </step>
- </procedure>
- </section>
- <section>
- <title>Administration</title>
- <para>Administrative tasks can be performed in HBase Shell or the Java API.</para>
- <caution>
- <title>Java API</title>
- <para>Java API examples in this section are taken from the source file
- <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckEncryption.java</filename>.</para>
- <para>Neither these examples nor the source file they are taken from are part of the
- public HBase API; they are provided for illustration only. Refer to the official API
- for usage instructions.</para>
- </caution>
- <variablelist>
- <varlistentry>
- <term>Enable Encryption on a Column Family</term>
- <listitem>
- <para>To enable encryption on a column family, you can either use HBase Shell or the
- Java API. After enabling encryption, trigger a major compaction. When the major
- compaction completes, the HFiles will be encrypted.</para>
- <example>
- <title>HBase Shell</title>
- <screen>
-hbase> disable 'mytable'
-hbase> alter 'mytable', {NAME => 'mycf', ENCRYPTION => 'AES'}
-hbase> enable 'mytable'
- </screen>
- </example>
- <example>
- <title>Java API</title>
- <para>You can use the <code>HBaseAdmin#modifyColumn</code> API to modify the
- <property>ENCRYPTION</property> attribute on a Column Family. Additionally, you
- can supply a specific data key, wrapped with the cluster master key, by setting the
- <property>ENCRYPTION_KEY</property> attribute. This is only possible via the
- Java API, and not the HBase Shell. If you do not specify an
- <property>ENCRYPTION_KEY</property> for a column family, HBase generates a random
- data key for each HFile of that column family. This provides
- additional defense in the (unlikely, but theoretically possible) occurrence of
- storing the same data in multiple HFiles with exactly the same block layout, the
- same data key, and the same randomly-generated initialization vector.</para>
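- <para>The collision scenario mentioned above can be made concrete with a small JDK-only
- sketch. It is not HBase code: the class <code>KeyReuseSketch</code> and the sample block
- contents are hypothetical. With AES in CTR mode, the same key, IV, and plaintext always
- yield the same ciphertext, whereas a fresh random key gives unrelated output even when the
- IV repeats.</para>

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: effect of reusing a data key and IV across files. */
public class KeyReuseSketch {

  /** Returns n cryptographically random bytes. */
  static byte[] randomBytes(int n) {
    byte[] b = new byte[n];
    new SecureRandom().nextBytes(b);
    return b;
  }

  /** Encrypts plain with AES/CTR/NoPadding under the given raw key and IV. */
  static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) throws Exception {
    Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
    c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    return c.doFinal(plain);
  }

  public static void main(String[] args) throws Exception {
    byte[] block = "identical block contents".getBytes(StandardCharsets.UTF_8);
    byte[] iv = randomBytes(16);
    byte[] sharedKey = randomBytes(16);

    byte[] fileA = encrypt(sharedKey, iv, block);       // shared key
    byte[] fileB = encrypt(sharedKey, iv, block);       // shared key again
    byte[] fileC = encrypt(randomBytes(16), iv, block); // fresh random key

    System.out.println("shared key matches: " + Arrays.equals(fileA, fileB));
    System.out.println("fresh key matches:  " + Arrays.equals(fileA, fileC));
  }
}
```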
- <para>This example shows how to programmatically set up transparent encryption both
- in the server configuration and on the column family, as part of a test which uses
- the Minicluster configuration.</para>
- <programlisting language="java">
-@Before
-public void setUp() throws Exception {
- conf = TEST_UTIL.getConfiguration();
- conf.setInt("hfile.format.version", 3);
- conf.set(HConstants.CRYPTO_KEYPROVIDER_CONF_KEY, KeyProviderForTesting.class.getName());
- conf.set(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, "hbase");
-
- // Create the test encryption key
- SecureRandom rng = new SecureRandom();
- byte[] keyBytes = new byte[AES.KEY_LENGTH];
- rng.nextBytes(keyBytes);
- cfKey = new SecretKeySpec(keyBytes, "AES");
-
- // Start the minicluster
- TEST_UTIL.startMiniCluster(3);
-
- // Create the table
- htd = new HTableDescriptor(TableName.valueOf("default", "TestHBaseFsckEncryption"));
- HColumnDescriptor hcd = new HColumnDescriptor("cf");
- hcd.setEncryptionType("AES");
- hcd.setEncryptionKey(EncryptionUtil.wrapKey(conf,
- conf.get(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, User.getCurrent().getShortName()),
- cfKey));
- htd.addFamily(hcd);
- TEST_UTIL.getHBaseAdmin().createTable(htd);
- TEST_UTIL.waitTableAvailable(htd.getName(), 5000);
-}
- </programlisting>
- </example>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Rotate the Data Key</term>
- <listitem>
- <para>To rotate the data key, first change the ColumnFamily key in the column
- descriptor, then trigger a major compaction. When compaction is complete, all HFiles
- will be re-encrypted using the new data key. Until the compaction completes, the
- old HFiles will still be readable using the old key.</para>
- <para>If you rely on HBase's default behavior of generating a random key for each
- HFile, there is no need to rotate data keys. A major compaction will re-encrypt the
- HFile with a new key.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Switching Between Using a Random Data Key and Specifying A Key</term>
- <listitem>
- <para>If you configured a column family to use a specific key and you want to return
- to the default behavior of using a randomly-generated key for that column family,
- use the Java API to alter the <code>HColumnDescriptor</code> so that no value is
- sent with the key <literal>EN
<TRUNCATED>