Posted to commits@hbase.apache.org by mi...@apache.org on 2014/10/07 09:22:17 UTC
[1/2] HBASE-11791 Update docs on visibility tags and ACLs,
transparent encryption, secure bulk upload
Repository: hbase
Updated Branches:
refs/heads/master 989c6262f -> 38bc5360c
http://git-wip-us.apache.org/repos/asf/hbase/blob/38bc5360/src/main/docbkx/security.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/security.xml b/src/main/docbkx/security.xml
index bb4ae17..9a76c8b 100644
--- a/src/main/docbkx/security.xml
+++ b/src/main/docbkx/security.xml
@@ -467,950 +467,1277 @@ grant 'rest_server', 'RWCA'
</section>
<!-- Simple User Access to Apache HBase -->
- <section
- xml:id="hbase.tags">
- <title>Tags</title>
- <para> Every cell can have metadata associated with it. Adding metadata in the data part of
- every cell would make things difficult. </para>
- <para> The 0.98 version of HBase solves this problem by providing Tags along with the cell
- format. Some of the usecases that uses the tags are Visibility labels, Cell level ACLs, etc. </para>
- <para> HFile V3 version from 0.98 onwards supports tags and this feature can be turned on using
- the following configuration </para>
- <programlisting language="xml"><![CDATA[
+ <section>
+ <title>Securing Access To Your Data</title>
+ <para>After you have configured secure authentication between HBase client and server processes
+ and gateways, you need to consider the security of your data itself. HBase provides several
+ strategies for securing your data:</para>
+ <itemizedlist>
+ <listitem>
+ <para>Role-based Access Control (RBAC) controls which users or groups can read and write to
+ a given HBase resource or execute a coprocessor endpoint, using the familiar paradigm of
+ roles.</para>
+ </listitem>
+ <listitem>
+ <para>Visibility Labels allow you to label cells and control access to labelled cells,
+ to further restrict who can read or write to certain subsets of your data. Visibility
+ labels are stored as tags. See <xref linkend="hbase.tags"/> for more information.</para>
+ </listitem>
+ <listitem>
+ <para>Transparent encryption of data at rest on the underlying filesystem, both in HFiles
+ and in the WAL. This protects your data at rest from an attacker who has access to the
+ underlying filesystem, without the need to change the implementation of the client. It can
+ also protect against data leakage from improperly disposed disks, which can be important
+ for legal and regulatory compliance.</para>
+ </listitem>
+ </itemizedlist>
+ <para>Server-side configuration, administration, and implementation details of each of these
+ features are discussed below, along with any performance trade-offs. An example security
+ configuration is given at the end, to show these features all used together, as they might be
+ in a real-world scenario.</para>
+ <caution>
+ <para>All aspects of security in HBase are in active development and evolving rapidly. Any
+ strategy you employ for security of your data should be thoroughly tested. In addition, some
+ of these features are still in the experimental stage of development. To take advantage of
+ many of these features, you must be running HBase 0.98+ and using the HFile v3 file
+ format.</para>
+ </caution>
+
+ <warning>
+ <title>Protecting Sensitive Files</title>
+ <para>Several procedures in this section require you to copy files between cluster nodes. When
+ copying keys, configuration files, or other files containing sensitive strings, use a secure
+ method, such as <code>ssh</code>, to avoid leaking sensitive data.</para>
+ </warning>
+
+ <procedure xml:id="security.data.basic.server.side">
+ <title>Basic Server-Side Configuration</title>
+ <step>
+ <para>Enable HFile v3, by setting <option>hfile.format.version</option> to 3 in
+ <filename>hbase-site.xml</filename>. This is the default for HBase 1.0 and
+ newer.</para>
+ <programlisting language="xml"><![CDATA[
<property>
<name>hfile.format.version</name>
<value>3</value>
</property>
- ]]></programlisting>
- <para> Every cell can have zero or more tags. Every tag has a type and the actual tag byte
- array. The types <command>0-31</command> are reserved for System tags. For example ‘1’ is
- reserved for ACL and ‘2’ is reserved for Visibility tags. </para>
- <para> The way rowkeys, column families, qualifiers and values are encoded using different
- Encoding Algos, similarly the tags can also be encoded. Tag encoding can be turned on per CF.
- Default is always turn ON. To turn on the tag encoding on the HFiles use </para>
- <programlisting language="java"><![CDATA[
-HColumnDescriptor#setCompressTags(boolean compressTags)
- ]]></programlisting>
- <para> Note that encoding of tags takes place only if the DataBlockEncoder is enabled for the
- CF. </para>
- <para> As we compress the WAL entries using Dictionary the tags present in the WAL can also be
- compressed using Dictionary. Every tag is compressed individually using WAL Dictionary. To
- turn ON tag compression in WAL dictionary enable the property </para>
- <programlisting language="xml"><![CDATA[
-<property>
- <name>hbase.regionserver.wal.tags.enablecompression</name>
+ ]]></programlisting>
+ </step>
+ <step>
+ <para>Enable SASL and Kerberos authentication for RPC and ZooKeeper, as described in <xref
+ linkend="security.prerequisites"/> and <xref linkend="zk.sasl.auth"/>.</para>
+ </step>
+ </procedure>
+
+ <section xml:id="hbase.tags">
+ <title>Tags</title>
+ <para><firstterm>Tags</firstterm> are a feature of HFile v3. A tag is a piece of metadata
+ which is part of a cell, separate from the key, value, and version. Tags are an
+ implementation detail which provides a foundation for other security-related features such
+ as cell-level ACLs and visibility labels. Tags are stored in the HFiles themselves. It is
+ possible that in the future, tags will be used to implement other HBase features. You don't
+ need to know a lot about tags in order to use the security features they enable.</para>
+ <section>
+ <title>Implementation Details</title>
+ <para> Every cell can have zero or more tags. Every tag has a type and the actual tag byte
+ array.</para>
+ <para> Just as row keys, column families, qualifiers and values can be encoded (see <xref
+ linkend="data.block.encoding.types"/>), tags can also be encoded. You can enable
+ or disable tag encoding at the level of the column family, and it is enabled by default.
+ Use the <code>HColumnDescriptor#setCompressTags(boolean compressTags)</code> method to
+ manage encoding settings on a column family. You also need to enable the DataBlockEncoder
+ for the column family, for encoding of tags to take effect.</para>
+ <para>You can enable compression of each tag in the WAL, if WAL compression is also enabled,
+ by setting the value of <option>hbase.regionserver.wal.tags.enablecompression</option> to
+ <literal>true</literal> in <filename>hbase-site.xml</filename>. Tag compression uses
+ dictionary encoding.</para>
+ <para>Tag compression is not supported when using WAL encryption.</para>
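+ <para>As a minimal sketch of the WAL settings described above, the following
+ <filename>hbase-site.xml</filename> fragment turns on tag compression. It assumes that
+ WAL compression itself is controlled by
+ <option>hbase.regionserver.wal.enablecompression</option>, a property this section does
+ not name, so verify it against your HBase version before relying on it.</para>
+ <programlisting language="xml"><![CDATA[
+<!-- Assumption: this property enables WAL compression, which tag compression depends on. -->
+<property>
+ <name>hbase.regionserver.wal.enablecompression</name>
+ <value>true</value>
+</property>
+<!-- Compress each tag in the WAL using dictionary encoding. -->
+<property>
+ <name>hbase.regionserver.wal.tags.enablecompression</name>
+ <value>true</value>
+</property>
+ ]]></programlisting>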
+ </section>
+ </section>
+
+ <section xml:id="hbase.accesscontrol.configuration">
+ <title>Access Control Lists (ACLs)</title>
+ <section>
+ <title>How It Works</title>
+ <para>ACLs in HBase are based upon a user's membership in or exclusion from groups, and a
+ given group's permissions to access a given resource. ACLs are implemented as a
+ coprocessor called AccessController.</para>
+ <para>HBase does not maintain a private group mapping, but relies on a <firstterm>Hadoop
+ group mapper</firstterm>, which maps between entities in a directory such as LDAP or
+ Active Directory, and HBase users. Any supported Hadoop group mapper will work. Users are
+ then granted specific permissions (Read, Write, Execute, Create, Admin) against resources
+ (global, namespaces, tables, cells, or endpoints).</para>
+ <note>
+ <para> With Kerberos and Access Control enabled, client access to HBase is authenticated
+ and user data is private unless access has been explicitly granted.</para>
+ </note>
+ <para>HBase has a simpler security model than relational databases, especially in terms of
+ client operations. No distinction is made between an insert (new record) and update (of
+ existing record), for example, as both collapse down into a Put. Accordingly, the
+ important operations condense to four permissions: READ, WRITE, CREATE, and ADMIN. A
+ fifth permission, EXECUTE, applies only to coprocessor endpoints.</para>
+ <table>
+ <title>Operation To Permission Mapping</title>
+ <tgroup cols="2" align="left" colsep="1" rowsep="1">
+ <colspec colname="c1" align="center"/>
+ <colspec colname="c2" align="left"/>
+ <thead>
+ <row>
+ <entry>Permission</entry>
+ <entry>Operation</entry>
+ </row>
+ </thead>
+ <tbody>
+ <!-- READ -->
+ <row>
+ <entry>Read</entry>
+ <entry>Get</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Exists</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Scan</entry>
+ </row>
+ <!-- WRITE -->
+ <row>
+ <entry>Write</entry>
+ <entry>Put</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Delete</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>IncrementColumnValue</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>CheckAndDelete/Put</entry>
+ </row>
+ <!-- CREATE -->
+ <row>
+ <entry>Create</entry>
+ <entry>Create</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Alter</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Drop</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Bulk Load</entry>
+ </row>
+ <!-- ADMIN -->
+ <row>
+ <entry>Admin</entry>
+ <entry>Enable/Disable</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Snapshot/Restore/Clone</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Split</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Flush</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Compact</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Major Compact</entry>
+ </row>
+ <row>
+ <entry />
+ <entry>Roll HLog</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Grant</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Revoke</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>Shutdown</entry>
+ </row>
+ <row>
+ <entry>Execute</entry>
+ <entry>Execute coprocessor endpoints</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ <para> Permissions can be granted in any of the following scopes, though CREATE and ADMIN
+ permissions are effective only at table, namespace, and global scopes. </para>
+ <variablelist>
+ <varlistentry>
+ <term>Namespace</term>
+ <listitem>
+ <itemizedlist>
+ <listitem>
+ <para>Read: User can read any table in the namespace.</para>
+ </listitem>
+ <listitem>
+ <para>Write: User can write to any table in the namespace.</para>
+ </listitem>
+ <listitem>
+ <para>Create: User can create tables in the namespace.</para>
+ </listitem>
+ <listitem>
+ <para>Admin: User can alter table attributes; add, alter, or drop column families;
+ and enable, disable, or drop the table. User can also trigger region
+ (re)assignments or relocation.</para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Table</term>
+ <listitem>
+ <itemizedlist>
+ <listitem>
+ <para>Read: User can read from any column family in the table.</para>
+ </listitem>
+ <listitem>
+ <para>Write: User can write to any column family in the table.</para>
+ </listitem>
+ <listitem>
+ <para>Create: User can alter table attributes; add, alter, or drop column
+ families; and drop the table.</para>
+ </listitem>
+ <listitem>
+ <para>Admin: User can alter table attributes; add, alter, or drop column families;
+ and enable, disable, or drop the table. User can also trigger region
+ (re)assignments or relocation.</para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Column Family / Column Qualifier / Cell</term>
+ <listitem>
+ <itemizedlist>
+ <listitem>
+ <para>Read: User can read at the specified scope.</para>
+ </listitem>
+ <listitem>
+ <para>Write: User can write at the specified scope.</para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Coprocessor Endpoint</term>
+ <listitem>
+ <para>Execute: the user can execute the coprocessor endpoint.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Global</term>
+ <listitem>
+ <para>Superusers are specified as a comma-separated list of users and groups, in the
+ <option>hbase.superuser</option> option in <filename>hbase-site.xml</filename>.
+ The superuser is equivalent to the <literal>root</literal> user in a UNIX
+ environment. As a minimum, the superuser should include the principal used to run
+ the HMaster process. Global admin privileges, which are implicitly granted to the
+ superuser, are required to create namespaces, switch the balancer on and off, or
+ take other actions with global consequences. The superuser can also grant all
+ permissions to all resources.</para>
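+ <para>For illustration only, a superuser list might look like the following. The
+ <literal>hbase</literal> user and the <literal>admins</literal> group are
+ hypothetical names, and the sketch assumes that groups in this list take the same
+ <literal>@</literal> prefix that is used for groups in grant statements.</para>
+ <programlisting language="xml"><![CDATA[
+<!-- Hypothetical values: one user (hbase) and one group (@admins). -->
+<property>
+ <name>hbase.superuser</name>
+ <value>hbase,@admins</value>
+</property>
+ ]]></programlisting>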
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <formalpara>
+ <title>ACL Matrix</title>
+ <para>For more details on how ACLs map to specific HBase operations and tasks, see <xref
+ linkend="appendix_acl_matrix"/>.</para>
+ </formalpara>
+ <para>Cell-level ACLs are implemented using tags (see <xref linkend="hbase.tags"/>). In
+ order to use cell-level ACLs, you must be using HFile v3 and HBase 0.98 or newer.</para>
+ <orderedlist>
+ <title>ACL Implementation Caveats</title>
+ <listitem>
+ <para>Files created by HBase are owned by the operating system user running the HBase
+ process. To interact with HBase files, you should use the API or bulk load
+ facility.</para>
+ </listitem>
+ <listitem>
+ <para>HBase does not model "roles" internally. Instead, group names can be
+ granted permissions. This allows external modeling of roles via group membership.
+ Groups are created and manipulated externally to HBase, via the Hadoop group mapping
+ service.</para>
+ </listitem>
+ </orderedlist>
+ </section>
+ <section>
+ <title>Server-Side Configuration</title>
+ <procedure>
+ <step>
+ <para>As a prerequisite, perform the steps in <xref
+ linkend="security.data.basic.server.side"/>.</para></step>
+ <step>
+ <para>Install and configure the AccessController coprocessor, by setting the following
+ properties in <filename>hbase-site.xml</filename>. These properties take a list of
+ classes. </para>
+ <note>
+ <para>If you use the AccessController along with the VisibilityController, the
+ AccessController must come first in the list, because with both components active,
+ the VisibilityController will delegate access control on its system tables to the
+ AccessController. For an example of using both together, see <xref
+ linkend="security.example.config"/>.</para></note>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hbase.coprocessor.region.classes</name>
+ <value>org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.token.TokenProvider</value>
+</property>
+<property>
+ <name>hbase.coprocessor.master.classes</name>
+ <value>org.apache.hadoop.hbase.security.access.AccessController</value>
+</property>
+<property>
+ <name>hbase.coprocessor.regionserver.classes</name>
+ <value>org.apache.hadoop.hbase.security.access.AccessController</value>
+</property>
+<property>
+ <name>hbase.security.exec.permission.checks</name>
<value>true</value>
</property>
- ]]></programlisting>
- <para> To add tags to every cell during Puts, the following apis are provided </para>
- <programlisting language="java"><![CDATA[
-Put#add(byte[] family, byte [] qualifier, byte [] value, Tag[] tag)
-Put#add(byte[] family, byte[] qualifier, long ts, byte[] value, Tag[] tag)
- ]]></programlisting>
+ ]]></programlisting>
+ <para>Optionally, you can enable transport security, by setting
+ <option>hbase.rpc.protection</option> to <literal>privacy</literal> (the setting that
+ corresponds to the SASL <literal>auth-conf</literal> quality of protection). This
+ requires HBase 0.98.4 or newer.</para>
+ </step>
+ <step>
+ <para>Set up the Hadoop group mapper in the Hadoop namenode's
+ <filename>core-site.xml</filename>. This is a Hadoop file, not an HBase file.
+ Customize it to your site's needs. Following is an example.</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hadoop.security.group.mapping</name>
+ <value>org.apache.hadoop.security.LdapGroupsMapping</value>
+</property>
- <para> Some of the feature developed using tags are Cell level ACLs and Visibility labels. These
- are some features that use tags framework and allows users to gain better security features on
- cell level. </para>
- <para> For details, see:</para>
- <para>
- <link
- linkend="hbase.accesscontrol.configuration">Access Control</link>
- <link
- linkend="hbase.visibility.labels">Visibility labels</link>
- </para>
- </section>
+<property>
+ <name>hadoop.security.group.mapping.ldap.url</name>
+ <value>ldap://server</value>
+</property>
- <section
- xml:id="hbase.accesscontrol.configuration">
- <title>Access Control</title>
- <para> Newer releases of Apache HBase (>= 0.92) support optional access control list (ACL-)
- based protection of resources on a column family and/or table basis. </para>
- <para> This describes how to set up Secure HBase for access control, with an example of granting
- and revoking user permission on table resources provided. </para>
+<property>
+ <name>hadoop.security.group.mapping.ldap.bind.user</name>
+ <value>Administrator@example-ad.local</value>
+</property>
- <section>
- <title>Prerequisites</title>
- <para> You must configure HBase for secure or simple user access operation. Refer to the <link
- linkend="hbase.accesscontrol.configuration">Secure Client Access to HBase</link> or <link
- linkend="hbase.secure.simpleconfiguration">Simple User Access to HBase</link> sections and
- complete all of the steps described there. </para>
- <para> For secure access, you must also configure ZooKeeper for secure operation. Changes to
- ACLs are synchronized throughout the cluster using ZooKeeper. Secure authentication to
- ZooKeeper must be enabled or otherwise it will be possible to subvert HBase access control
- via direct client access to ZooKeeper. Refer to the section on secure ZooKeeper
- configuration and complete all of the steps described there. </para>
+<property>
+ <name>hadoop.security.group.mapping.ldap.bind.password</name>
+ <value>****</value>
+</property>
+
+<property>
+ <name>hadoop.security.group.mapping.ldap.base</name>
+ <value>dc=example-ad,dc=local</value>
+</property>
+
+<property>
+ <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
+ <value>(&(objectClass=user)(sAMAccountName={0}))</value>
+</property>
+
+<property>
+ <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
+ <value>(objectClass=group)</value>
+</property>
+
+<property>
+ <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
+ <value>member</value>
+</property>
+
+<property>
+ <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
+ <value>cn</value>
+</property>]]>
+ </programlisting>
+ </step>
+ <step>
+ <para>Optionally, enable the early-out evaluation strategy. Prior to HBase 0.98.0, if a
+ user was not granted access to a column family, or at least a column qualifier, an
+ AccessDeniedException would be thrown. HBase 0.98.0 removed this exception in order to
+ allow cell-level exceptional grants. To restore the old behavior in HBase
+ 0.98.0-0.98.6, set <option>hbase.security.access.early_out</option> to
+ <literal>true</literal> in <filename>hbase-site.xml</filename>. In HBase 0.98.6, the
+ default has been returned to <literal>true</literal>.</para>
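+ <para>As a sketch, the early-out setting described above would appear in
+ <filename>hbase-site.xml</filename> as follows. Only the property name and value come
+ from this step; whether you need to set it explicitly depends on the default in your
+ release.</para>
+ <programlisting language="xml"><![CDATA[
+<!-- Restore the pre-0.98.0 behavior: deny early when no family or qualifier grant exists. -->
+<property>
+ <name>hbase.security.access.early_out</name>
+ <value>true</value>
+</property>
+ ]]></programlisting>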
+ </step>
+ <step>
+ <para>Distribute your configuration and restart your cluster for changes to take
+ effect.</para>
+ </step>
+ <step>
+ <para>To test your configuration, log into HBase Shell as a given user and use the
+ <command>whoami</command> command to report the groups your user is part of. In this
+ example, the user is reported as being a member of the <code>services</code>
+ group.</para>
+ <screen>
+hbase> <userinput>whoami</userinput>
+<computeroutput>service (auth:KERBEROS)
+ groups: services</computeroutput>
+ </screen>
+ </step>
+ </procedure>
+ </section>
+ <section>
+ <title>Administration</title>
+ <para>Administration tasks can be performed from HBase Shell or via an API.</para>
+ <caution>
+ <title>API Examples</title>
+ <para>Many of the API examples below are taken from source files
+ <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java</filename>
+ and
+ <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/SecureTestUtil.java</filename>.</para>
+ <para>Neither the examples, nor the source files they are taken from, are part of the
+ public HBase API, and are provided for illustration only. Refer to the
+ official API for usage instructions.</para>
+ </caution>
+ <procedure>
+ <step>
+ <title>User and Group Administration</title>
+ <para>Users and groups are maintained external to HBase, in your directory.</para>
+ </step>
+ <step>
+ <title>Granting Access To A Namespace, Table, Column Family, or Cell</title>
+ <para>There are a few different types of syntax for grant statements. The first, and
+ most familiar, is as follows, with the table, column family, and column qualifier
+ being optional:</para>
+ <screen>grant 'user', 'RWXCA', 'TABLE', 'CF', 'CQ'</screen>
+ <para>Groups and users are granted access in the same way, but groups are prefixed with
+ an <literal>@</literal> symbol. Likewise, tables and namespaces are specified
+ in the same way, but namespaces are prefixed with an <literal>@</literal>
+ symbol.</para>
+ <para>It is also possible to grant multiple permissions against the same resource in a
+ single statement, as in the cell-level example in the HBase Shell examples for this
+ step. The first sub-clause maps users to ACLs and the second sub-clause specifies the
+ resource.</para>
+ <note>
+ <para>HBase Shell support for granting and revoking access at the cell level is for
+ testing and verification support, and should not be employed for production use
+ because it won't apply the permissions to cells that don't exist yet. The correct
+ way to apply cell level permissions is to do so in the application code when storing
+ the values.</para>
+ </note>
+ <formalpara>
+ <title>ACL Granularity and Evaluation Order</title>
+ <para>ACLs are evaluated from least granular to most granular, and when an ACL is
+ reached that grants permission, evaluation stops. This means that cell ACLs do not
+ override ACLs at a coarser granularity.</para>
+ </formalpara>
+ <example>
+ <title>HBase Shell</title>
+ <itemizedlist>
+ <listitem>
+ <para>Global:</para>
+ <screen>hbase> <userinput>grant '@admins', 'RWXCA'</userinput></screen>
+ </listitem>
+ <listitem>
+ <para>Namespace:</para>
+ <screen>hbase> <userinput>grant 'service', 'RWXCA', '@test-NS'</userinput></screen>
+ </listitem>
+ <listitem>
+ <para>Table:</para>
+ <screen>hbase> <userinput>grant 'service', 'RWXCA', 'user'</userinput></screen>
+ </listitem>
+ <listitem>
+ <para>Column Family:</para>
+ <screen>hbase> <userinput>grant '@developers', 'RW', 'user', 'i'</userinput></screen>
+ </listitem>
+ <listitem>
+ <para>Column Qualifier:</para>
+ <screen>hbase> <userinput>grant 'service', 'RW', 'user', 'i', 'foo'</userinput></screen>
+ </listitem>
+ <listitem>
+ <para>Cell:</para>
+ <para>Cell ACLs are granted using the following syntax:</para>
+ <screen>grant <replaceable>&lt;table&gt;</replaceable>, \
+ { '<replaceable>&lt;user-or-group&gt;</replaceable>' => \
+ '<replaceable>&lt;permissions&gt;</replaceable>', ... }, \
+ { <replaceable>&lt;scanner-specification&gt;</replaceable> }</screen>
+ <itemizedlist>
+ <listitem>
+ <para><replaceable>&lt;user-or-group&gt;</replaceable> is the user or group
+ name, prefixed with <literal>@</literal> in the case of a group.</para>
+ </listitem>
+ <listitem>
+ <para><replaceable>&lt;permissions&gt;</replaceable> is a string containing
+ any or all of "RWXCA", though only R and W are meaningful at cell
+ scope.</para>
+ </listitem>
+ <listitem>
+ <para><replaceable>&lt;scanner-specification&gt;</replaceable> is the scanner
+ specification syntax and conventions used by the 'scan' shell command. For
+ some examples of scanner specifications, issue the following HBase Shell
+ command.</para>
+ <screen>hbase> help "scan"</screen>
+ </listitem>
+ </itemizedlist>
+ <para>This example grants read access to the 'testuser' user and read/write access
+ to the 'developers' group, on cells in the 'pii' column which match the
+ filter.</para>
+ <screen>hbase> grant 'user', \
+ { '@developers' => 'RW', 'testuser' => 'R' }, \
+ { COLUMNS => 'pii', FILTER => "(PrefixFilter ('test'))" }</screen>
+ <para>The shell will run a scanner with the given criteria, rewrite the found
+ cells with new ACLs, and store them back to their exact coordinates.</para>
+ </listitem>
+ </itemizedlist>
+ </example>
+ <example>
+ <title>API</title>
+ <para>The following example shows how to grant access at the
+ table level.</para>
+ <programlisting language="java"><![CDATA[
+public static void grantOnTable(final HBaseTestingUtility util, final String user,
+ final TableName table, final byte[] family, final byte[] qualifier,
+ final Permission.Action... actions) throws Exception {
+ SecureTestUtil.updateACLs(util, new Callable<Void>() {
+ @Override
+ public Void call() throws Exception {
+ HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME);
+ try {
+ BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
+ AccessControlService.BlockingInterface protocol =
+ AccessControlService.newBlockingStub(service);
+ ProtobufUtil.grant(protocol, user, table, family, qualifier, actions);
+ } finally {
+ acl.close();
+ }
+ return null;
+ }
+ });
+} ]]>
+ </programlisting>
+ <para>To grant permissions at the cell level, you can use the
+ <code>Mutation.setACL</code> method:</para>
+ <programlisting language="java"><![CDATA[
+Mutation.setACL(String user, Permission perms)
+Mutation.setACL(Map<String, Permission> perms)
+ ]]>
+ </programlisting>
+ <para>Specifically, this example provides read permission to a user called
+ <literal>user1</literal> on any cells contained in a particular Put
+ operation:</para>
+ <programlisting language="java"><![CDATA[
+put.setACL("user1", new Permission(Permission.Action.READ));
+ ]]></programlisting>
+ </example>
+ </step>
+ <step>
+ <title>Revoking Access Control From a Namespace, Table, Column Family, or Cell</title>
+ <para>The <command>revoke</command> command and API are twins of the grant command and
+ API, and the syntax is exactly the same. The only exception is that you cannot revoke
+ permissions at the cell level. You can only revoke access that has previously been
+ granted, and a <command>revoke</command> statement is not the same thing as explicit
+ denial to a resource.</para>
+ <note>
+ <para>HBase Shell support for granting and revoking access at the cell level is for
+ testing and verification support, and should not be employed for production use
+ because it won't apply the permissions to cells that don't exist yet. The correct way
+ to apply cell-level permissions is to do so in the application code when storing the
+ values.</para>
+ </note>
+ <example>
+ <title>Revoking Access To a Table</title>
+ <programlisting language="java">
+<![CDATA[public static void revokeFromTable(final HBaseTestingUtility util, final String user,
+ final TableName table, final byte[] family, final byte[] qualifier,
+ final Permission.Action... actions) throws Exception {
+ SecureTestUtil.updateACLs(util, new Callable<Void>() {
+ @Override
+ public Void call() throws Exception {
+ HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME);
+ try {
+ BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
+ AccessControlService.BlockingInterface protocol =
+ AccessControlService.newBlockingStub(service);
+ ProtobufUtil.revoke(protocol, user, table, family, qualifier, actions);
+ } finally {
+ acl.close();
+ }
+ return null;
+ }
+ });
+} ]]>
+ </programlisting>
+ </example>
+ </step>
+ <step>
+ <title>Showing a User's Effective Permissions</title>
+ <example>
+ <title>HBase Shell</title>
+ <screen>hbase> user_permission 'user'</screen>
+ <screen>hbase> user_permission '.*'</screen>
+ <screen>hbase> user_permission <replaceable>JAVA_REGEX</replaceable></screen>
+ </example>
+ <example>
+ <title>API</title>
+ <programlisting language="java"><![CDATA[
+public static void verifyAllowed(User user, AccessTestAction action, int count) throws Exception {
+ try {
+ Object obj = user.runAs(action);
+ if (obj != null && obj instanceof List<?>) {
+ List<?> results = (List<?>) obj;
+ if (results != null && results.isEmpty()) {
+ fail("Empty non null results from action for user '" + user.getShortName() + "'");
+ }
+ assertEquals(count, results.size());
+ }
+ } catch (AccessDeniedException ade) {
+ fail("Expected action to pass for user '" + user.getShortName() + "' but was denied");
+ }
+}]]>
+ </programlisting>
+ </example>
+ </step>
+ </procedure>
+ </section>
</section>
<section>
- <title>Overview</title>
- <para> With Secure RPC and Access Control enabled, client access to HBase is authenticated and
- user data is private unless access has been explicitly granted. Access to data can be
- granted at a table or per column family basis. </para>
- <para> However, the following items have been left out of the initial implementation for
- simplicity: </para>
- <orderedlist>
- <listitem>
- <para>Row-level or per value (cell): Using Tags in HFile V3</para>
- </listitem>
- <listitem>
- <para>Push down of file ownership to HDFS: HBase is not designed for the case where files
- may have different permissions than the HBase system principal. Pushing file ownership
- down into HDFS would necessitate changes to core code. Also, while HDFS file ownership
- would make applying quotas easy, and possibly make bulk imports more straightforward, it
- is not clear that it would offer a more secure setup.</para>
- </listitem>
- <listitem>
- <para>HBase managed "roles" as collections of permissions: We will not model "roles"
- internally in HBase to begin with. We instead allow group names to be granted
- permissions, which allows external modeling of roles via group membership. Groups are
- created and manipulated externally to HBase, via the Hadoop group mapping
- service.</para>
- </listitem>
- </orderedlist>
- <para> Access control mechanisms are mature and fairly standardized in the relational database
- world. The HBase implementation approximates current convention, but HBase has a simpler
- feature set than relational databases, especially in terms of client operations. We don't
- distinguish between an insert (new record) and update (of existing record), for example, as
- both collapse down into a Put. Accordingly, the important operations condense to four
- permissions: READ, WRITE, CREATE, and ADMIN. </para>
+ <title>Visibility Labels</title>
+ <para>Visibility labels can be used to permit only users or principals associated with
+ a given label to read or access cells with that label. For instance, you might label a cell
+ <literal>top-secret</literal>, and only grant access to that label to the
+ <literal>managers</literal> group. Visibility labels are implemented using Tags, which are
+ a feature of HFile v3, and allow you to store metadata on a per-cell basis. A label is a
+ string, and labels can be combined into expressions by using logical operators (&amp;, |, or
+ !), and using parentheses for grouping. HBase does not do any kind of validation of
+ expressions beyond basic well-formedness. Visibility labels have no meaning on their own,
+ and may be used to denote sensitivity level, privilege level, or any other arbitrary
+ semantic meaning.</para>
+ <para>If a user's labels do not match a cell's label or expression, the user is
+ denied access to the cell.</para>
+ <para>In HBase 0.98.6 and newer, UTF-8 encoding is supported for visibility labels and
+ expressions. When creating labels using the <code>addLabels(conf, labels)</code> method
+ provided by the <code>org.apache.hadoop.hbase.security.visibility.VisibilityClient</code>
+ class and passing labels in Authorizations via Scan or Get, labels can contain UTF-8
+ characters, as well as the logical operators normally used in visibility labels, with normal
+ Java notations, without needing any escaping method. However, when you pass a CellVisibility
+ expression via a Mutation, you must enclose the expression with the
+ <code>CellVisibility.quote()</code> method if you use UTF-8 characters or logical
+ operators. See <code>TestExpressionParser</code> and the source file
+ <filename>hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestScan.java</filename>.
+ </para>
+ <para>A user adds visibility expressions to a cell during a Put operation. In the default
+ configuration, the user does not need access to a label in order to label cells with it.
+ This behavior is controlled by the configuration option
+ <option>hbase.security.visibility.mutations.checkauths</option>. If you set this option to
+ <literal>true</literal>, the labels the user is modifying as part of the mutation must be
+ associated with the user, or the mutation will fail. Whether a user is authorized to read a
+ labelled cell is determined during a Get or Scan, and results which the user is not allowed
+ to read are filtered out. This incurs the same I/O penalty as if the results were returned,
+ but reduces load on the network.</para>
+ <para>Visibility labels can also be specified during Delete operations. For details about
+ visibility labels and Deletes, see <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-10885">HBASE-10885</link>. </para>
+ <para>The user's effective label set is built in the RPC context when a request is first
+ received by the RegionServer. The way that users are associated with labels is pluggable.
+ The default plugin passes through labels specified in Authorizations added to the Get or
+ Scan and checks those against the calling user's authenticated labels list. When the client
+ passes labels for which the user is not authenticated, the default plugin drops them. You
+ can pass a subset of user authenticated labels via the
+ <code>Get#setAuthorizations(Authorizations(String,...))</code> and
+ <code>Scan#setAuthorizations(Authorizations(String,...))</code> methods. </para>
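+ <para>The label-dropping behavior described above can be pictured as a simple intersection.
+ The following is a minimal, self-contained sketch and not HBase code: the class
+ <code>EffectiveLabelsSketch</code> and the method <code>effectiveLabels</code> are
+ hypothetical names used only for illustration, and the real logic lives in the pluggable
+ label generator on the server.</para>
+ <programlisting language="java"><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustration only: mimics how requested labels are reduced to authenticated ones. */
public class EffectiveLabelsSketch {

  /**
   * Keeps the requested labels that the user is authenticated for, in request order.
   * Labels the user is not authenticated for are dropped rather than rejected.
   */
  public static List<String> effectiveLabels(List<String> requested, Set<String> authenticated) {
    List<String> effective = new ArrayList<>();
    for (String label : requested) {
      if (authenticated.contains(label)) {
        effective.add(label);
      }
    }
    return effective;
  }

  public static void main(String[] args) {
    // The client asks for two labels but is only authenticated for one of them.
    List<String> requested = List.of("secret", "topsecret");
    Set<String> authenticated = Set.of("secret", "confidential");
    System.out.println(effectiveLabels(requested, authenticated)); // prints [secret]
  }
}
```
+ ]]></programlisting>
+ <para>In this sketch the unauthenticated <literal>topsecret</literal> label is silently
+ removed, so the request proceeds with a smaller label set instead of failing.</para>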
+ <para>Visibility label access checking is performed by the VisibilityController coprocessor.
+ You can use interface <code>VisibilityLabelService</code> to provide a custom implementation
+ and/or control the way that visibility labels are stored with cells. See the source file
+ <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithCustomVisLabService.java</filename>
+ for one example.</para>
+
+ <para>Visibility labels can be used in conjunction with ACLs.</para>
<table>
- <title>Operation To Permission Mapping</title>
- <tgroup
- cols="2"
- align="left"
- colsep="1"
- rowsep="1">
- <colspec
- colname="c1"
- align="center" />
- <colspec
- colname="c2"
- align="left" />
+ <title>Examples of Visibility Expressions</title>
+ <tgroup cols="2">
<thead>
<row>
- <entry>Permission</entry>
- <entry>Operation</entry>
+ <entry>Expression</entry>
+ <entry>Interpretation</entry>
</row>
</thead>
<tbody>
- <!-- READ -->
- <row>
- <entry>Read</entry>
- <entry>Get</entry>
- </row>
- <row>
- <entry />
- <entry>Exists</entry>
- </row>
- <row>
- <entry />
- <entry>Scan</entry>
- </row>
- <!-- WRITE -->
- <row>
- <entry>Write</entry>
- <entry>Put</entry>
- </row>
- <row>
- <entry />
- <entry>Delete</entry>
- </row>
- <row>
- <entry />
- <entry>Lock/UnlockRow</entry>
- </row>
- <row>
- <entry />
- <entry>IncrementColumnValue</entry>
- </row>
- <row>
- <entry />
- <entry>CheckAndDelete/Put</entry>
- </row>
- <!-- CREATE -->
- <row>
- <entry>Create</entry>
- <entry>Create</entry>
- </row>
- <row>
- <entry />
- <entry>Alter</entry>
- </row>
<row>
- <entry />
- <entry>Drop</entry>
- </row>
- <row>
- <entry />
- <entry>Bulk Load</entry>
- </row>
- <!-- ADMIN -->
- <row>
- <entry>Admin</entry>
- <entry>Enable/Disable</entry>
- </row>
- <row>
- <entry />
- <entry>Snapshot/Restore/Clone</entry>
- </row>
- <row>
- <entry />
- <entry>Split</entry>
+ <entry><screen>fulltime</screen></entry>
+ <entry><para>Allow access to users associated with the
+ <code>fulltime</code> label.</para></entry>
</row>
<row>
- <entry />
- <entry>Flush</entry>
+ <entry><screen>!public</screen></entry>
+ <entry><para>Allow access to users not associated with the
+ <code>public</code> label.</para></entry>
</row>
<row>
- <entry />
- <entry>Compact</entry>
- </row>
- <row>
- <entry />
- <entry>Major Compact</entry>
- </row>
- <row>
- <entry />
- <entry>Roll HLog</entry>
- </row>
- <row>
- <entry />
- <entry>Grant</entry>
- </row>
- <row>
- <entry />
- <entry>Revoke</entry>
- </row>
- <row>
- <entry />
- <entry>Shutdown</entry>
+ <entry><screen>( secret | topsecret ) &amp; !probationary</screen></entry>
+ <entry><para>Allow access to users associated with either the
+ <code>secret</code> or <code>topsecret</code> label and not
+ associated with the <code>probationary</code> label.</para></entry>
</row>
</tbody>
</tgroup>
</table>
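+ <para>To make the table above concrete, the following sketch evaluates an expression
+ against a set of user labels. It is an illustration under stated assumptions, not the
+ parser HBase uses: the class <code>VisibilityExpressionSketch</code> and the method
+ <code>permits</code> are hypothetical, the sketch assumes <literal>&amp;</literal> binds
+ more tightly than <literal>|</literal>, and it ignores quoting of special characters.</para>
+ <programlisting language="java"><![CDATA[
```java
import java.util.Set;

/** Illustration only: a tiny recursive-descent evaluator for visibility expressions. */
public class VisibilityExpressionSketch {
  private final String expr;
  private final Set<String> auths;
  private int pos = 0;

  private VisibilityExpressionSketch(String expr, Set<String> auths) {
    this.expr = expr.replaceAll("\\s+", ""); // whitespace is not significant in this sketch
    this.auths = auths;
  }

  /** Returns true if a user holding {@code auths} satisfies the expression {@code expr}. */
  public static boolean permits(String expr, Set<String> auths) {
    VisibilityExpressionSketch parser = new VisibilityExpressionSketch(expr, auths);
    boolean result = parser.parseOr();
    if (parser.pos != parser.expr.length()) {
      throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
    }
    return result;
  }

  // or := and ('|' and)*
  private boolean parseOr() {
    boolean value = parseAnd();
    while (peek() == '|') {
      pos++;
      boolean rhs = parseAnd(); // always parse the right side so the position advances
      value = value | rhs;
    }
    return value;
  }

  // and := unary ('&' unary)*
  private boolean parseAnd() {
    boolean value = parseUnary();
    while (peek() == '&') {
      pos++;
      boolean rhs = parseUnary();
      value = value & rhs;
    }
    return value;
  }

  // unary := '!' unary | '(' or ')' | label
  private boolean parseUnary() {
    char c = peek();
    if (c == '!') {
      pos++;
      return !parseUnary();
    }
    if (c == '(') {
      pos++;
      boolean value = parseOr();
      if (peek() != ')') {
        throw new IllegalArgumentException("Missing ')' at position " + pos);
      }
      pos++;
      return value;
    }
    int start = pos;
    while (pos < expr.length() && "&|!()".indexOf(expr.charAt(pos)) < 0) {
      pos++;
    }
    if (start == pos) {
      throw new IllegalArgumentException("Expected a label at position " + pos);
    }
    // A bare label is satisfied when the user is associated with it.
    return auths.contains(expr.substring(start, pos));
  }

  private char peek() {
    return pos < expr.length() ? expr.charAt(pos) : '\0';
  }

  public static void main(String[] args) {
    Set<String> auths = Set.of("secret");
    System.out.println(permits("fulltime", auths));                             // false
    System.out.println(permits("!public", auths));                              // true
    System.out.println(permits("(secret | topsecret) & !probationary", auths)); // true
  }
}
```
+ ]]></programlisting>
+ <para>A user holding only <literal>secret</literal> fails the first expression and
+ satisfies the other two, which matches the interpretations given in the table.</para>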
- <para> Permissions can be granted in any of the following scopes, though CREATE and ADMIN
- permissions are effective only at table scope. </para>
+ <section>
+ <title>Server-Side Configuration</title>
+ <procedure>
+ <step>
+ <para>As a prerequisite, perform the steps in <xref
+ linkend="security.data.basic.server.side"/>.</para></step>
+ <step>
+ <para>Install and configure the VisibilityController coprocessor by setting the
+ following properties in <filename>hbase-site.xml</filename>. These properties take a
+ list of class names.</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hbase.coprocessor.region.classes</name>
+ <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
+</property>
+<property>
+ <name>hbase.coprocessor.master.classes</name>
+ <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
+</property>
+ ]]></programlisting>
+ <note>
+ <para>If you use the AccessController and VisibilityController coprocessors together,
+ the AccessController must come first in the list, because with both components
+ active, the VisibilityController will delegate access control on its system tables
+ to the AccessController.</para>
+ </note>
+ </step>
+ <step>
+ <title>Adjust Configuration</title>
+ <para>By default, users can label cells with any label, including labels they are not
+ associated with, which means that a user can Put data that they cannot read. For
+ example, a user could label a cell with the (hypothetical) 'topsecret' label even if
+ the user is not associated with that label. If you only want users to be able to label
+ cells with labels they are associated with, set
+ <property>hbase.security.visibility.mutations.checkauths</property> to
+ <literal>true</literal>. In that case, the mutation will fail if it makes use of
+ labels the user is not associated with.</para>
+ </step>
+ <step>
+ <para>Distribute your configuration and restart your cluster for changes to take
+ effect.</para>
+ </step>
+ </procedure>
+ </section>
+ <section>
+ <title>Administration</title>
+ <para>Administration tasks can be performed using the HBase Shell or the Java API. For
+ defining the list of visibility labels and associating labels with users, the
+ HBase Shell is probably simpler.</para>
+ <caution>
+ <title>API Examples</title>
+ <para>Many of the Java API examples in this section are taken from the source file
+ <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java</filename>.
+ Refer to that file or the API documentation for more context.</para>
+ <para>Neither these examples, nor the source file they were taken from, are part of the
+ public HBase API, and are provided for illustration only. Refer to the official API
+ for usage instructions.</para>
+ </caution>
+ <procedure>
+ <step>
+ <title>Define the List of Visibility Labels</title>
+ <example>
+ <title>HBase Shell</title>
+ <screen>hbase> <userinput>add_labels [ 'admin', 'service', 'developer', 'test' ]</userinput></screen>
+ </example>
+ <example>
+ <title>Java API</title>
+ <programlisting language="java"><![CDATA[
+public static void addLabels() throws Exception {
+ PrivilegedExceptionAction<VisibilityLabelsResponse> action =
+ new PrivilegedExceptionAction<VisibilityLabelsResponse>() {
+ public VisibilityLabelsResponse run() throws Exception {
+ String[] labels = { SECRET, TOPSECRET, CONFIDENTIAL, PUBLIC, PRIVATE, COPYRIGHT, ACCENT,
+ UNICODE_VIS_TAG, UC1, UC2 };
+ try {
+ VisibilityClient.addLabels(conf, labels);
+ } catch (Throwable t) {
+ throw new IOException(t);
+ }
+ return null;
+ }
+ };
+ SUPERUSER.runAs(action);
+}
+ ]]></programlisting>
+ </example>
+ </step>
+ <step>
+ <title>Associate Labels with Users</title>
+ <example>
+ <title>HBase Shell</title>
+ <screen>hbase> <userinput>set_auths 'service', [ 'service' ]</userinput></screen>
+ <screen>hbase> <userinput>set_auths 'testuser', [ 'test' ]</userinput></screen>
+ <screen>hbase> <userinput>set_auths 'qa', [ 'test', 'developer' ]</userinput></screen>
+ </example>
+ <example>
+ <title>Java API</title>
+ <programlisting language="java"><![CDATA[
+public void testSetAndGetUserAuths() throws Throwable {
+ final String user = "user1";
+ PrivilegedExceptionAction<Void> action = new PrivilegedExceptionAction<Void>() {
+ public Void run() throws Exception {
+ String[] auths = { SECRET, CONFIDENTIAL };
+ try {
+ VisibilityClient.setAuths(conf, auths, user);
+ } catch (Throwable e) {
+ }
+ return null;
+ }
+ ...
+ ]]></programlisting>
+ </example>
+ </step>
+ <step>
+ <title>Clear Labels From Users</title>
+ <example>
+ <title>HBase Shell</title>
+ <screen>hbase> <userinput>clear_auths 'service', [ 'service' ]</userinput></screen>
+ <screen>hbase> <userinput>clear_auths 'testuser', [ 'test' ]</userinput></screen>
+ <screen>hbase> <userinput>clear_auths 'qa', [ 'test', 'developer' ]</userinput></screen>
+ </example>
+ <example>
+ <title>Java API</title>
+ <programlisting language="java"><![CDATA[
+...
+auths = new String[] { SECRET, PUBLIC, CONFIDENTIAL };
+VisibilityLabelsResponse response = null;
+try {
+ response = VisibilityClient.clearAuths(conf, auths, user);
+} catch (Throwable e) {
+ fail("Should not have failed");
+...
+ ]]></programlisting>
+ </example>
+ </step>
+ <step>
+ <title>Apply a Label or Expression to a Cell</title>
+ <para>The label is only applied when data is written. The label is associated with a
+ given version of the cell.</para>
+ <example>
+ <title>HBase Shell</title>
+ <screen>hbase> <userinput>set_visibility 'user', 'admin|service|developer', \
+ { COLUMNS => 'i' }</userinput></screen>
+ <screen>hbase> <userinput>set_visibility 'user', 'admin|service', \
+ { COLUMNS => 'pii' }</userinput></screen>
+ <screen>hbase> <userinput>set_visibility 'user', 'test', \
+ { COLUMNS => [ 'i', 'pii' ], FILTER => "(PrefixFilter ('test'))" }</userinput></screen>
+ </example>
+ <note>
+ <para>HBase Shell support for applying labels or permissions to cells is for testing
+ and verification support, and should not be employed for production use because it
+ won't apply the labels to cells that don't exist yet. The correct way to apply cell
+ level labels is to do so in the application code when storing the values.</para>
+ </note>
+ <example>
+ <title>Java API</title>
+ <programlisting language="java"><![CDATA[
+static HTable createTableAndWriteDataWithLabels(TableName tableName, String... labelExps)
+ throws Exception {
+ HTable table = null;
+ try {
+ table = TEST_UTIL.createTable(tableName, fam);
+ int i = 1;
+ List<Put> puts = new ArrayList<Put>();
+ for (String labelExp : labelExps) {
+ Put put = new Put(Bytes.toBytes("row" + i));
+ put.add(fam, qual, HConstants.LATEST_TIMESTAMP, value);
+ put.setCellVisibility(new CellVisibility(labelExp));
+ puts.add(put);
+ i++;
+ }
+ table.put(puts);
+ } finally {
+ if (table != null) {
+ table.flushCommits();
+ }
+ }
+ return table;
+}
+ ]]></programlisting>
+ </example>
+ </step>
+ </procedure>
+ </section>
+ <section>
+ <title>Implementing Your Own Visibility Label Algorithm</title>
+ <para>Interpreting the labels authenticated for a given get/scan request is a pluggable
+ algorithm. You can specify a custom plugin by using the property
+ <code>hbase.regionserver.scan.visibility.label.generator.class</code>. The default
+ implementation class is
+ <code>org.apache.hadoop.hbase.security.visibility.DefaultScanLabelGenerator</code>. You
+ can also configure a set of <code>ScanLabelGenerators</code> to be used by the system, as
+ a comma-separated list.</para>
+ </section>
+ </section>
- <itemizedlist>
- <listitem>
- <para>Table</para>
- <para>
- <itemizedlist>
- <listitem>
- <para>Read: User can read from any column family in table</para>
- </listitem>
- <listitem>
- <para>Write: User can write to any column family in table</para>
- </listitem>
- <listitem>
- <para>Create: User can alter table attributes; add, alter, or drop column families;
- and drop the table.</para>
- </listitem>
- <listitem>
- <para>Admin: User can alter table attributes; add, alter, or drop column families;
- and enable, disable, or drop the table. User can also trigger region
- (re)assignments or relocation.</para>
- </listitem>
- </itemizedlist>
- </para>
- </listitem>
- <listitem>
- <para>Column Family</para>
- <para>
- <itemizedlist>
- <listitem>
- <para>Read: User can read from the column family</para>
- </listitem>
- <listitem>
- <para>Write: User can write to the column family</para>
- </listitem>
- </itemizedlist>
- </para>
- </listitem>
- </itemizedlist>
+ <section xml:id="hbase.encryption.server">
+ <title>Transparent Encryption of Data At Rest</title>
+ <para>HBase provides a mechanism for protecting your data at rest, in HFiles and the WAL, which
+ reside within HDFS or another distributed filesystem. A two-tier architecture is used for
+ flexible and non-intrusive key rotation. "Transparent" means that no implementation changes
+ are needed on the client side. When data is written, it is encrypted. When it is read, it is
+ decrypted on demand.</para>
+ <section>
+ <title>How It Works</title>
+ <para>The administrator provisions a master key for the cluster, which is stored in a key
+ provider accessible to every trusted HBase process, including the HMaster, RegionServers,
+ and clients (such as HBase Shell) on administrative workstations. The default key provider
+ is integrated with the Java KeyStore API and any key management systems with support for
+ it. Other custom key provider implementations are possible. The key retrieval mechanism is
+ configured in the <filename>hbase-site.xml</filename> configuration file. The master key
+ may be stored on the cluster servers, protected by a secure KeyStore file, or on an
+ external keyserver, or in a hardware security module. This master key is resolved as
+ needed by HBase processes through the configured key provider.</para>
+ <para>Next, encryption use can be specified in the schema, per column family, by creating
+ or modifying a column descriptor to include two additional attributes: the name of the
+ encryption algorithm to use (currently only "AES" is supported), and optionally, a data
+ key wrapped (encrypted) with the cluster master key. If a data key is not explicitly
+ configured for a ColumnFamily, HBase will create a random data key per HFile. This
+ provides an incremental improvement in security over the alternative. Unless you need to
+ supply an explicit data key, such as in a case where you are generating encrypted HFiles
+ for bulk import with a given data key, only specify the encryption algorithm in the
+ ColumnFamily schema metadata and let HBase create data keys on demand. Per Column Family
+ keys facilitate low impact incremental key rotation and reduce the scope of any external
+ leak of key material. The wrapped data key is stored in the ColumnFamily schema metadata,
+ and in each HFile for the Column Family, encrypted with the cluster master key. After the
+ Column Family is configured for encryption, any new HFiles will be written encrypted. To
+ ensure encryption of all HFiles, trigger a major compaction after enabling this
+ feature.</para>
+ <para>When the HFile is opened, the data key is extracted from the HFile, decrypted with the
+ cluster master key, and used for decryption of the remainder of the HFile. The HFile will
+ be unreadable if the master key is not available. If a remote user somehow acquires access
+ to the HFile data because of some lapse in HDFS permissions, or from inappropriately
+ discarded media, it will not be possible to decrypt either the data key or the file
+ data.</para>
+ <para>It is also possible to encrypt the WAL. Even though WALs are transient, it is
+ necessary to encrypt the WALEdits to avoid circumventing HFile protections for encrypted
+ column families, in the event that the underlying filesystem is compromised. When WAL
+ encryption is enabled, all WALs are encrypted, regardless of whether the relevant HFiles
+ are encrypted.</para>
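+ <para>The two-tier arrangement described in this section can be sketched with the standard
+ Java cryptography classes. This is an illustration only and does not reproduce HBase's
+ file format or its key-wrapping code: the class <code>TwoTierKeySketch</code> and its
+ methods are hypothetical, and the choice of the JCE <literal>AESWrap</literal> transformation
+ for wrapping is an assumption made for the example.</para>
+ <programlisting language="java"><![CDATA[
```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Illustration only: a master key wraps a data key, and the data key encrypts the data. */
public class TwoTierKeySketch {

  /** Generates a random 128-bit AES key. */
  public static SecretKey newAesKey() throws GeneralSecurityException {
    KeyGenerator generator = KeyGenerator.getInstance("AES");
    generator.init(128);
    return generator.generateKey();
  }

  /** Wraps (encrypts) the data key with the master key. */
  public static byte[] wrap(SecretKey master, SecretKey dataKey) throws GeneralSecurityException {
    Cipher cipher = Cipher.getInstance("AESWrap");
    cipher.init(Cipher.WRAP_MODE, master);
    return cipher.wrap(dataKey);
  }

  /** Recovers the data key; this fails if the wrong master key is supplied. */
  public static SecretKey unwrap(SecretKey master, byte[] wrapped)
      throws GeneralSecurityException {
    Cipher cipher = Cipher.getInstance("AESWrap");
    cipher.init(Cipher.UNWRAP_MODE, master);
    return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
  }

  /** Encrypts or decrypts with AES in CTR mode, depending on {@code mode}. */
  public static byte[] crypt(int mode, SecretKey key, byte[] iv, byte[] input)
      throws GeneralSecurityException {
    Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
    cipher.init(mode, key, new IvParameterSpec(iv));
    return cipher.doFinal(input);
  }

  /** Writes and then reads back a value, returning what the reader recovers. */
  public static String roundTrip(String plaintext) {
    try {
      SecretKey master = newAesKey();  // stands in for the cluster master key
      SecretKey dataKey = newAesKey(); // stands in for a randomly generated data key

      // Writer side: keep only the wrapped data key next to the encrypted data.
      byte[] wrappedDataKey = wrap(master, dataKey);
      byte[] iv = new byte[16];
      new SecureRandom().nextBytes(iv);
      byte[] ciphertext =
          crypt(Cipher.ENCRYPT_MODE, dataKey, iv, plaintext.getBytes(StandardCharsets.UTF_8));

      // Reader side: unwrap the data key with the master key, then decrypt.
      SecretKey recovered = unwrap(master, wrappedDataKey);
      byte[] decrypted = crypt(Cipher.DECRYPT_MODE, recovered, iv, ciphertext);
      return new String(decrypted, StandardCharsets.UTF_8);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("cell value")); // prints cell value
  }
}
```
+ ]]></programlisting>
+ <para>Because only the wrapped data key is stored alongside the data, rotating the master
+ key in this sketch would require re-wrapping the data key but not re-encrypting the data
+ itself.</para>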
+ </section>
+ <section>
+ <title>Server-Side Configuration</title>
+ <para>This procedure assumes you are using the default Java keystore implementation. If you
+ are using a custom implementation, check its documentation and adjust accordingly.</para>
+ <procedure>
+ <step>
+ <title>Create a secret key of appropriate length for AES encryption, using the
+ <code>keytool</code> utility.</title>
+ <screen>$ <userinput>keytool -keystore /path/to/hbase/conf/hbase.jks \
+ -storetype jceks -storepass **** \
+ -genseckey -keyalg AES -keysize 128 \
+ -alias &lt;alias&gt;</userinput></screen>
+ <para>Replace <replaceable>****</replaceable> with the password for the keystore file
+ and &lt;alias&gt; with the username of the HBase service account, or an arbitrary
+ string. If you use an arbitrary string, you will need to configure HBase to use it,
+ and that is covered below. Specify a keysize that is appropriate. Do not specify a
+ separate password for the key, but press <keycap>Return</keycap> when prompted.</para>
+ </step>
+ <step>
+ <title>Set appropriate permissions on the keyfile and distribute it to all the HBase
+ servers.</title>
+ <para>The previous command created a file called <filename>hbase.jks</filename> in the
+ HBase <filename>conf/</filename> directory. Set the permissions and ownership on this
+ file such that only the HBase service account user can read the file, and securely
+ distribute the key to all HBase servers.</para>
+ </step>
+ <step>
+ <title>Configure the HBase daemons.</title>
+ <para>Set the following properties in <filename>hbase-site.xml</filename> on the region
+ servers, to configure HBase daemons to use a key provider backed by the KeyStore file
+ for retrieving the cluster master key. In the example below, replace
+ <replaceable>****</replaceable> with the password.</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hbase.crypto.keyprovider</name>
+ <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
+</property>
+<property>
+ <name>hbase.crypto.keyprovider.parameters</name>
+ <value>jceks:///path/to/hbase/conf/hbase.jks?password=****</value>
+</property>
+ ]]></programlisting>
+ <para>By default, the HBase service account name will be used to resolve the cluster
+ master key. However, you can store it with an arbitrary alias (in the
+ <command>keytool</command> command). In that case, set the following property to the
+ alias you used.</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hbase.crypto.master.key.name</name>
+ <value>my-alias</value>
+</property>]]>
+ </programlisting>
+ <para>You also need to be sure your HFiles use HFile v3, in order to use transparent
+ encryption. This is the default configuration for HBase 1.0 onward. For previous
+ versions, set the following property in your <filename>hbase-site.xml</filename>
+ file.</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hfile.format.version</name>
+ <value>3</value>
+</property>]]>
+ </programlisting>
+ <para>Optionally, you can use a different cipher provider, either a Java Cryptography
+ Extension (JCE) algorithm provider or a custom HBase cipher implementation. </para>
+ <substeps>
+ <step>
+ <title>JCE: </title>
+ <itemizedlist>
+ <listitem>
+ <para>Install a signed JCE provider (supporting “AES/CTR/NoPadding” mode with
+ 128 bit keys) </para>
+ </listitem>
+ <listitem>
+ <para>Add it with highest preference to the JCE site configuration file
+ <filename>$JAVA_HOME/lib/security/java.security</filename>.</para>
+ </listitem>
+ <listitem>
+ <para>Update <option>hbase.crypto.algorithm.aes.provider</option> and
+ <option>hbase.crypto.algorithm.rng.provider</option> options in
+ <filename>hbase-site.xml</filename>. </para>
+ </listitem>
+ </itemizedlist>
+ </step>
+ <step>
+ <title>Custom HBase Cipher: </title>
+ <itemizedlist>
+ <listitem>
+ <para>Implement
+ <code>org.apache.hadoop.hbase.io.crypto.CipherProvider</code>.</para>
+ </listitem>
+ <listitem>
+ <para>Add the implementation to the server classpath.</para>
+ </listitem>
+ <listitem>
+ <para>Update <option>hbase.crypto.cipherprovider</option> in
+ <filename>hbase-site.xml</filename>.</para>
+ </listitem>
+ </itemizedlist>
+ </step>
+ </substeps>
+ </step>
+ <step>
+ <title>Configure WAL encryption.</title>
+ <para>Configure WAL encryption in every RegionServer's
+ <filename>hbase-site.xml</filename>, by setting the following properties. You can
+ include these in the HMaster's <filename>hbase-site.xml</filename> as well, but the
+ HMaster does not have a WAL and will not use them.</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hbase.regionserver.hlog.reader.impl</name>
+ <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
+</property>
+<property>
+ <name>hbase.regionserver.hlog.writer.impl</name>
+ <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
+</property>
+<property>
+ <name>hbase.regionserver.wal.encryption</name>
+ <value>true</value>
+</property>
+ ]]></programlisting>
+ </step>
+ <step>
+ <title>Configure permissions on the <filename>hbase-site.xml</filename> file.</title>
+ <para>Because the keystore password is stored in the hbase-site.xml, you need to ensure
+ that only the HBase user can read the <filename>hbase-site.xml</filename> file, using
+ file ownership and permissions.</para>
+ </step>
+ <step>
+ <title>Restart your cluster.</title>
+ <para>Distribute the new configuration file to all nodes and restart your
+ cluster.</para>
+ </step>
+ </procedure>
+ </section>
+ <section>
+ <title>Administration</title>
+ <para>Administrative tasks can be performed in HBase Shell or the Java API.</para>
+ <caution>
+ <title>Java API</title>
+ <para>Java API examples in this section are taken from the source file
+ <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckEncryption.java</filename>.
+ </para>
+ <para>Neither these examples, nor the source files they are taken from, are part of the
+ public HBase API, and are provided for illustration only. Refer to the official API
+ for usage instructions.</para>
+ </caution>
+ <variablelist>
+ <varlistentry>
+ <term>Enable Encryption on a Column Family</term>
+ <listitem>
+ <para>To enable encryption on a column family, you can either use HBase Shell or the
+ Java API. After enabling encryption, trigger a major compaction. When the major
+ compaction completes, the HFiles will be encrypted.</para>
+ <example>
+ <title>HBase Shell</title>
+ <screen>
+hbase> disable 'mytable'
+hbase> alter 'mytable', 'mycf', {ENCRYPTION => AES}
+hbase> enable 'mytable'
+ </screen>
+ </example>
+ <example>
+ <title>Java API</title>
+ <para>You can use the <code>HBaseAdmin#modifyColumn</code> API to modify the
+ <property>ENCRYPTION</property> attribute on a Column Family. Additionally, you
+ can specify the specific key to use as the wrapper, by setting the
+ <property>ENCRYPTION_KEY</property> attribute. This is only possible via the
+ Java API, and not the HBase Shell. The default behavior if you do not specify an
+ <property>ENCRYPTION_KEY</property> for a column family is for a random key to
+ be generated for each encrypted column family (per HFile). This provides
+ additional defense in the (unlikely, but theoretically possible) occurrence of
+ storing the same data in multiple HFiles with exactly the same block layout, the
+ same data key, and the same randomly-generated initialization vector.</para>
+ <para>This example shows how to programmatically set the transparent encryption both
+ in the server configuration and at the column family, as part of a test which uses
+ the Minicluster configuration.</para>
+ <programlisting language="java">
+@Before
+public void setUp() throws Exception {
+ conf = TEST_UTIL.getConfiguration();
+ conf.setInt("hfile.format.version", 3);
+ conf.set(HConstants.CRYPTO_KEYPROVIDER_CONF_KEY, KeyProviderForTesting.class.getName());
+ conf.set(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, "hbase");
+
+ // Create the test encryption key
+ SecureRandom rng = new SecureRandom();
+ byte[] keyBytes = new byte[AES.KEY_LENGTH];
+ rng.nextBytes(keyBytes);
+ cfKey = new SecretKeySpec(keyBytes, "AES");
- <para> There is also an implicit global scope for the superuser. </para>
- <para> The superuser is a principal, specified in the HBase site configuration file, that has
- equivalent access to HBase as the 'root' user would on a UNIX derived system. Normally this
- is the principal that the HBase processes themselves authenticate as. Although future
- versions of HBase Access Control may support multiple superusers, the superuser privilege
- will always include the principal used to run the HMaster process. Only the superuser is
- allowed to create tables, switch the balancer on or off, or take other actions with global
- consequence. Furthermore, the superuser has an implicit grant of all permissions to all
- resources. </para>
- <para> Tables have a new metadata attribute: OWNER, the user principal who owns the table. By
- default this will be set to the user principal who creates the table, though it may be
- changed at table creation time or during an alter operation by setting or changing the OWNER
- table attribute. Only a single user principal can own a table at a given time. A table owner
- will have all permissions over a given table. </para>
+ // Start the minicluster
+ TEST_UTIL.startMiniCluster(3);
+
+ // Create the table
+ htd = new HTableDescriptor(TableName.valueOf("default", "TestHBaseFsckEncryption"));
+ HColumnDescriptor hcd = new HColumnDescriptor("cf");
+ hcd.setEncryptionType("AES");
+ hcd.setEncryptionKey(EncryptionUtil.wrapKey(conf,
+ conf.get(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, User.getCurrent().getShortName()),
+ cfKey));
+ htd.addFamily(hcd);
+ TEST_UTIL.getHBaseAdmin().createTable(htd);
+ TEST_UTIL.waitTableAvailable(htd.getName(), 5000);
+}
+ </programlisting>
+ </example>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Rotate the Data Key</term>
+ <listitem>
+ <para>To rotate the data key, first change the ColumnFamily key in the column
+ descriptor, then trigger a major compaction. When compaction is complete, all HFiles
+ will be re-encrypted using the new data key. Until the compaction completes, the
+ old HFiles will still be readable using the old key.</para>
+ <para>If you rely on HBase's default behavior of generating a random key for each
+ HFile, there is no need to rotate data keys. A major compaction will re-encrypt the
+ HFile with a new key.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Switching Between Using a Random Data Key and Specifying A Key</term>
+ <listitem>
+ <para>If you configured a column family to use a specific key and you want to return
+ to the default behavior of using a randomly-generated key for that column family,
+ use the Java API to alter the <code>HColumnDescriptor</code> so that no value is
+ sent with the key <literal>ENCRYPTION_KEY</literal>.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Rotate the Master Key</term>
+ <listitem>
+ <para>To rotate the master key, first generate and distribute the new key. Then update
+ the KeyStore to contain a new master key, and keep the old master key in the
+ KeyStore using a different alias. Next, configure fallback to the old master key in
+ the <filename>hbase-site.xml</filename> file.</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+ <name>hbase.crypto.master.alternate.key.name</name>
+ <value>hbase.old</value>
+</property>
+ ]]></programlisting>
+ <para>Perform a rolling restart of your cluster for this change to take effect. Trigger a major
+ compaction on each table. At the end of the major compaction, all HFiles will be
+ re-encrypted with data keys wrapped by the new cluster key. At this point, you can
+ remove the old master key from the KeyStore, remove the configuration for the
+ fallback master key from the <filename>hbase-site.xml</filename>, and perform a
+ second rolling restart at some point. This second rolling restart is not
+ time-sensitive.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ </variablelist>
+ </section>
</section>
- <section>
- <title>Access Control Matrix</title>
- <para>The following matrix shows the minimum permission set required to perform operations in
- HBase. Before using the table, read through the information about how to interpret it.</para>
- <variablelist>
- <title>Interpreting the ACL Matrix Table</title>
- <para>The following conventions are used in the ACL Matrix table:</para>
- <varlistentry>
- <term>Scopes</term>
- <listitem>
- <para>Permissions are evaluated starting at the widest scope and working to the
- narrowest scope. A scope corresponds to a level of the data model. From broadest to
- narrowest, the scopes are as follows::</para>
- <itemizedlist>
- <listitem><para>Global</para></listitem>
- <listitem><para>Namespace (NS)</para></listitem>
- <listitem><para>Table</para></listitem>
- <listitem><para>Column Qualifier (CF)</para></listitem>
- <listitem><para>Column Family (CQ)</para></listitem>
- <listitem><para>Cell</para></listitem>
- </itemizedlist>
- <para>For instance, a permission granted at table level dominates any grants done at the
- ColumnFamily, ColumnQualifier, or cell level. The user can do what that grant implies
- at any location in the table. A permission granted at global scope dominates all: the
- user is always allowed to take that action everywhere.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Permissions</term>
- <listitem>
- <para>Possible permissions include the following:</para>
- <itemizedlist>
- <listitem><para>Superuser - a special user that belongs to group "supergroup" and has
- unlimited access</para></listitem>
- <listitem><para>Admin (A)</para></listitem>
- <listitem><para>Create (C)</para></listitem>
- <listitem><para>Write (W)</para></listitem>
- <listitem><para>Read (R)</para></listitem>
- <listitem><para>Execute (X)</para></listitem>
- </itemizedlist>
- </listitem>
- </varlistentry>
- </variablelist>
- <para>For the most part, permissions work in an expected way, with the following caveats:</para>
+ <section
+ xml:id="hbase.secure.bulkload">
+ <title>Secure Bulk Load</title>
+ <para> Bulk loading in secure mode is a bit more involved than normal setup, since the client
+ has to transfer the ownership of the files generated from the MapReduce job to HBase. Secure
+ bulk loading is implemented by a coprocessor, named <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/security/access/SecureBulkLoadEndpoint.html"
+ >SecureBulkLoadEndpoint</link>, which uses a staging directory configured by the
+ configuration property <option>hbase.bulkload.staging.dir</option>, which defaults to
+ <filename>/tmp/hbase-staging/</filename>.</para>
<itemizedlist>
+ <title>Secure Bulk Load Algorithm</title>
<listitem>
- <para>Having Write permission does not imply Read permission. It is possible and sometimes
- desirable for a user to be able to write data that same user cannot read. One such example
- is a log-writing process.</para>
- </listitem>
- <listitem>
- <para>Admin is a superset of Create, so a user with Admin permissions does not also need
- Create permissions to perform an action such as creating a table.</para>
+ <para>One time only, create a staging directory that is world-traversable and owned by
+ the user that runs HBase (mode 711, or <literal>rwx--x--x</literal>). A listing of this
+ directory will look similar to the following: </para>
+ <screen>$ <userinput>ls -ld /tmp/hbase-staging</userinput>
+drwx--x--x 2 hbase hbase 68 3 Sep 14:54 /tmp/hbase-staging
+ </screen>
</listitem>
<listitem>
- <para>The <systemitem>hbase:meta</systemitem> table is readable by every user, regardless
- of the user's other grants or restrictions. This is a requirement for HBase to
- function correctly.</para>
+ <para>A user writes out data to a secure output directory owned by that user. For example,
+ <filename>/user/foo/data</filename>.</para>
</listitem>
<listitem>
- <para>Users with Create or Admin permissions are granted Write permission on meta regions,
- so the table operations they are allowed to perform can complete, even if technically
- the bits can be granted separately in any possible combination.</para>
+ <para>Internally, HBase creates a secret staging directory that is globally
+ readable and writable (<code>drwxrwxrwx, 777</code>). For example,
+ <filename>/tmp/hbase-staging/averylongandrandomdirectoryname</filename>. The name and
+ location of this directory are not exposed to the user. HBase manages creation and
+ deletion of this directory.</para>
</listitem>
<listitem>
- <para><code>CheckAndPut</code> and <code>CheckAndDelete</code> operations will fail if the user does not have both
- Write and Read permission.</para>
- </listitem>
- <listitem>
- <para><code>Increment</code> and <code>Append</code> operations do not require Read access.</para>
+ <para>The user makes the data world-readable and world-writable, moves it into the random
+ staging directory, then calls the <code>SecureBulkLoadClient#bulkLoadHFiles</code>
+ method.</para>
</listitem>
</itemizedlist>
- <para>The following table is sorted by the interface that provides each operation. In case the
- table goes out of date, the unit tests which check for accuracy of permissions can be found
- in
- <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java</filename>,
- and the access controls themselves can be examined in
- <filename>hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java</filename>.</para>
-
- <table
- frame="all">
- <title>ACL Matrix</title>
- <tgroup
- cols="4">
- <thead>
- <row>
- <entry>Interface</entry>
- <entry>Operation</entry>
- <entry>Minimum Scope</entry>
- <entry>Minimum Permission</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry
- morerows="27">
- <!-- incrememt this if you add another "master" operation -->
- <para>Master</para>
- </entry>
- <entry>
- <para>createTable</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>modifyTable</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>deleteTable</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>truncateTable</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>addColumn</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>modifyColumn</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>deleteColumn</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>disableTable</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>disableAclTable</para>
- </entry>
- <entry>
- <para>None</para>
- </entry>
- <entry>
- <para>Not allowed</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>enableTable</para>
- </entry>
- <entry>
- <para>Table</para>
- </entry>
- <entry>
- <para>A|CW</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>move</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>assign</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>unassign</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>regionOffline</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>balance</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>balanceSwitch</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>shutdown</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>stopMaster</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>snapshot</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>clone</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>restore</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>deleteSnapshot</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>createNamespace</para>
- </entry>
- <entry>
- <para>Global</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>deleteNamespace</para>
- </entry>
- <entry>
- <para>Namespace</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>modifyNamespace</para>
- </entry>
- <entry>
- <para>Namespace</para>
- </entry>
- <entry>
- <para>A</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>flushTable</para>
- </entry>
- <entry>
- <par
<TRUNCATED>
[2/2] git commit: HBASE-11791 Update docs on visibility tags and ACLs,
transparent encryption, secure bulk upload
Posted by mi...@apache.org.
HBASE-11791 Update docs on visibility tags and ACLs, transparent encryption, secure bulk upload
Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/38bc5360
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/38bc5360
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/38bc5360
Branch: refs/heads/master
Commit: 38bc5360c598e632b9e901c34ba261be6eb43580
Parents: 989c626
Author: Misty Stanley-Jones <ms...@cloudera.com>
Authored: Thu Oct 2 09:21:57 2014 +1000
Committer: Misty Stanley-Jones <ms...@cloudera.com>
Committed: Tue Oct 7 17:22:02 2014 +1000
----------------------------------------------------------------------
src/main/docbkx/book.xml | 3 +-
src/main/docbkx/security.xml | 2558 +++++++++++++++++++------------------
2 files changed, 1324 insertions(+), 1237 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/hbase/blob/38bc5360/src/main/docbkx/book.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/book.xml b/src/main/docbkx/book.xml
index eea00d6..b2cf1dd 100644
--- a/src/main/docbkx/book.xml
+++ b/src/main/docbkx/book.xml
@@ -5425,6 +5425,7 @@ This option should not normally be used, and it is not in <code>-fixAll</code>.
</section>
</appendix>
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="appendix_acl_matrix.xml" />
<appendix
xml:id="compression">
@@ -5490,7 +5491,7 @@ This option should not normally be used, and it is not in <code>-fixAll</code>.
</itemizedlist>
- <itemizedlist>
+ <itemizedlist xml:id="data.block.encoding.types">
<title>Data Block Encoding Types</title>
<listitem>
<para>Prefix - Often, keys are very similar. Specifically, keys often share a common prefix