Posted to commits@hbase.apache.org by el...@apache.org on 2017/06/16 18:59:38 UTC

hbase git commit: HBASE-17840 Update hbase book to space quotas on snapshots

Repository: hbase
Updated Branches:
  refs/heads/master c7a64a831 -> 4dc805145


HBASE-17840 Update hbase book to space quotas on snapshots


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/4dc80514
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/4dc80514
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/4dc80514

Branch: refs/heads/master
Commit: 4dc805145b1d089a5c75d212bec922c1f6cf5fc5
Parents: c7a64a8
Author: Josh Elser <el...@apache.org>
Authored: Wed May 31 15:02:32 2017 -0400
Committer: Josh Elser <el...@apache.org>
Committed: Fri Jun 16 11:24:31 2017 -0700

----------------------------------------------------------------------
 src/main/asciidoc/_chapters/ops_mgt.adoc | 45 +++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/4dc80514/src/main/asciidoc/_chapters/ops_mgt.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/ops_mgt.adoc b/src/main/asciidoc/_chapters/ops_mgt.adoc
index b26e44b..6181b13 100644
--- a/src/main/asciidoc/_chapters/ops_mgt.adoc
+++ b/src/main/asciidoc/_chapters/ops_mgt.adoc
@@ -1964,6 +1964,51 @@ In these cases, the user may configure the system to not delete any space quota
   </property>
 ----
 
+=== HBase Snapshots with Space Quotas
+
+HBase snapshots are a common source of unintended filesystem use. Because snapshots
+exist outside of the management of HBase tables, it is not uncommon for administrators to suddenly
+realize that hundreds of gigabytes or terabytes of space are being used by HBase snapshots which were
+forgotten and never removed.
+
+link:https://issues.apache.org/jira/browse/HBASE-17748[HBASE-17748] is the umbrella JIRA issue which
+extends the original space quota functionality to also include HBase snapshots. While this is a confusing
+subject, the implementation attempts to present this support as simply and reasonably as
+possible for administrators. This feature does not change how administrators interact with
+space quotas; it changes only the internal computation of table and namespace usage. Table and namespace
+usage will automatically incorporate the size taken by a snapshot per the rules defined below.
+
+As a review, let's cover a snapshot's lifecycle: a snapshot is metadata which points to
+a list of HFiles on the filesystem. This is why creating a snapshot is a very cheap operation; no HBase
+table data is actually copied to perform a snapshot. Cloning a snapshot into a new table or restoring
+a table is a cheap operation for the same reason; the new table references the files which already exist
+on the filesystem without a copy. To include snapshots in space quotas, we need to define which table
+"owns" a file when a snapshot references the file ("owns" refers to encompassing the filesystem usage
+of that file).
+
+Consider a snapshot which was made against a table. When the snapshot refers to a file and the table no
+longer refers to that file, the "originating" table "owns" that file. When multiple snapshots refer to
+the same file and no table refers to that file, the snapshot with the lowest-sorting name (lexicographically)
+is chosen and the table which that snapshot was created from "owns" that file. HFiles are not "double-counted"
+when a table and one or more snapshots refer to that HFile.
+
+When a table is "rematerialized" (via `clone_snapshot` or `restore_snapshot`), a similar problem of file
+ownership arises. In this case, while the rematerialized table references a file which a snapshot also
+references, the table does not "own" the file. The table from which the snapshot was created still "owns"
+that file. When the rematerialized table is compacted or the snapshot is deleted, the rematerialized table
+will uniquely refer to a new file and "own" the usage of that file. Similarly, when a table is duplicated via a snapshot
+and `restore_snapshot`, the new table will not consume any quota size until the original table stops referring
+to the files, either due to a compaction on the original table, a compaction on the new table, or the
+original table being deleted.
+
+One new HBase shell command was added to inspect the computed size of each snapshot in an HBase instance.
+
+----
+hbase> list_snapshot_sizes
+SNAPSHOT                                      SIZE
+ t1.s1                                        1159108
+----
+
 [[ops.backup]]
 == HBase Backup
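
The ownership rules described in the added section can be illustrated with a small sketch. This is not HBase code: the function names, the dictionary-based inputs, and the tie-breaking for files that no snapshot references are all assumptions made for illustration. It models only what the text states: a file referenced by one or more snapshots is charged to the table that the lowest-sorting snapshot was created from, and each file is counted once.

```python
from typing import Dict, Set


def compute_file_owners(
    table_files: Dict[str, Set[str]],
    snapshot_files: Dict[str, Set[str]],
    snapshot_source: Dict[str, str],
) -> Dict[str, str]:
    """Map each HFile name to the table whose quota usage includes it.

    table_files:     table name -> HFiles the table currently references
    snapshot_files:  snapshot name -> HFiles the snapshot references
    snapshot_source: snapshot name -> table the snapshot was created from
    (All three structures are hypothetical; HBase tracks this differently.)
    """
    owners: Dict[str, str] = {}
    all_files = set().union(*table_files.values(), *snapshot_files.values())
    for hfile in sorted(all_files):
        snapshots = sorted(
            name for name, files in snapshot_files.items() if hfile in files
        )
        if snapshots:
            # A snapshot references the file: the table that the
            # lowest-sorting snapshot was taken from owns it, even if a
            # cloned or restored table also references the file.
            owners[hfile] = snapshot_source[snapshots[0]]
        else:
            # No snapshot references the file. Assumption: normally exactly
            # one table refers to it; sort only to stay deterministic.
            tables = sorted(
                name for name, files in table_files.items() if hfile in files
            )
            owners[hfile] = tables[0]
    return owners


def table_usage(owners: Dict[str, str], file_sizes: Dict[str, int]) -> Dict[str, int]:
    """Sum file sizes per owning table; each file is counted exactly once."""
    usage: Dict[str, int] = {}
    for hfile, table in owners.items():
        usage[table] = usage.get(table, 0) + file_sizes[hfile]
    return usage
```

For example, if table t1 was snapshotted as t1.s1, then compacted, and t2 was cloned from t1.s1, this sketch charges every file to t1 and nothing to t2 until the snapshot is removed. Python's default string ordering stands in for the "lowest-sorting name" comparison, which is also an assumption.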