Posted to commits@hbase.apache.org by nd...@apache.org on 2020/03/07 18:42:35 UTC

[hbase] branch branch-2 updated (6ebe966 -> 3e5d3a0)

This is an automated email from the ASF dual-hosted git repository.

ndimiduk pushed a change to branch branch-2
in repository https://gitbox.apache.org/repos/asf/hbase.git.


    from 6ebe966  HBASE-23739 BoundedRecoveredHFilesOutputSink should read the table descriptor directly (#1223)
     new 5872758  Revert "HBASE-18418 Remove apache_hbase_topology from dev-support"
     new 3e5d3a0  Revert "HBASE-23945 Dockerfiles showing hadolint check failures"

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 dev-support/Dockerfile                             |  10 +-
 .../apache_hbase_topology/Dockerfile               |   9 +-
 dev-support/apache_hbase_topology/README.md        |  49 +++
 .../__init__.py}                                   |   7 -
 dev-support/apache_hbase_topology/actions.py       | 421 +++++++++++++++++++++
 .../apache_hbase_topology/configurations.cfg       |  80 ++++
 dev-support/apache_hbase_topology/profile.cfg      |  82 ++++
 dev-support/apache_hbase_topology/ssh/id_rsa       |  44 +++
 .../apache_hbase_topology/ssh/id_rsa.pub           |   6 +-
 dev-support/hbase_docker/Dockerfile                |  12 +-
 10 files changed, 690 insertions(+), 30 deletions(-)
 copy hbase-mapreduce/src/main/resources/org/apache/hadoop/hbase/mapreduce/RowCounter_Counters.properties => dev-support/apache_hbase_topology/Dockerfile (75%)
 create mode 100644 dev-support/apache_hbase_topology/README.md
 copy dev-support/{python-requirements.txt => apache_hbase_topology/__init__.py} (94%)
 create mode 100644 dev-support/apache_hbase_topology/actions.py
 create mode 100644 dev-support/apache_hbase_topology/configurations.cfg
 create mode 100644 dev-support/apache_hbase_topology/profile.cfg
 create mode 100644 dev-support/apache_hbase_topology/ssh/id_rsa
 copy bin/hbase-jruby => dev-support/apache_hbase_topology/ssh/id_rsa.pub (67%)


[hbase] 02/02: Revert "HBASE-23945 Dockerfiles showing hadolint check failures"

Posted by nd...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ndimiduk pushed a commit to branch branch-2
in repository https://gitbox.apache.org/repos/asf/hbase.git

commit 3e5d3a0ab3c139608309eb0115d03a8c07bc173b
Author: Nick Dimiduk <nd...@gmail.com>
AuthorDate: Sat Mar 7 10:42:04 2020 -0800

    Revert "HBASE-23945 Dockerfiles showing hadolint check failures"
    
    This reverts commit 4205677eb73fd5f09d0ed8d8d957edca4ff4a59f.
---
 dev-support/Dockerfile              | 10 +++-------
 dev-support/hbase_docker/Dockerfile | 12 ++++--------
 2 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/dev-support/Dockerfile b/dev-support/Dockerfile
index 89b7726..5d79988 100644
--- a/dev-support/Dockerfile
+++ b/dev-support/Dockerfile
@@ -22,12 +22,8 @@
 # dev-support/flaky-tests/flaky-reporting.Jenkinsfile
 FROM ubuntu:18.04
 
-COPY . /hbase/dev-support
+ADD . /hbase/dev-support
 
-RUN DEBIAN_FRONTEND=noninteractive apt-get -qq -y update \
-    && DEBIAN_FRONTEND=noninteractive apt-get -qq -y install --no-install-recommends \
-      curl=7.58.0-2ubuntu3.8 \
-      python-pip=9.0.1-2.3~ubuntu1.18.04.1 \
-    && apt-get clean \
-    && rm -rf /var/lib/apt/lists/* \
+RUN apt-get -y update \
+    && apt-get -y install curl python-pip \
     && pip install -r /hbase/dev-support/python-requirements.txt
diff --git a/dev-support/hbase_docker/Dockerfile b/dev-support/hbase_docker/Dockerfile
index c018a30..1a5dfa3 100644
--- a/dev-support/hbase_docker/Dockerfile
+++ b/dev-support/hbase_docker/Dockerfile
@@ -17,25 +17,21 @@
 FROM ubuntu:14.04
 
 # Install Git, which is missing from the Ubuntu base images.
-RUN DEBIAN_FRONTEND=noninteractive apt-get -qq -y update \
-  && DEBIAN_FRONTEND=noninteractive apt-get -qq -y install --no-install-recommends \
-    git=1:1.9.1-1ubuntu0.10 \
-  && apt-get clean \
-  && rm -rf /var/lib/apt/lists/*
+RUN apt-get update && apt-get install -y git
 
 # Add the dependencies from the hbase_docker folder and delete ones we don't need.
 WORKDIR /root
-COPY . /root
+ADD . /root
 RUN find . -not -name "*tar.gz" -delete
 
 # Install Java.
 RUN mkdir -p /usr/java
-RUN tar xzf ./*jdk* --strip-components 1 -C /usr/java
+RUN tar xzf *jdk* --strip-components 1 -C /usr/java
 ENV JAVA_HOME /usr/java
 
 # Install Maven.
 RUN mkdir -p /usr/local/apache-maven
-RUN tar xzf ./*maven* --strip-components 1 -C /usr/local/apache-maven
+RUN tar xzf *maven* --strip-components 1 -C /usr/local/apache-maven
 ENV MAVEN_HOME /usr/local/apache-maven
 
 # Add Java and Maven to the path.


[hbase] 01/02: Revert "HBASE-18418 Remove apache_hbase_topology from dev-support"

Posted by nd...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ndimiduk pushed a commit to branch branch-2
in repository https://gitbox.apache.org/repos/asf/hbase.git

commit 5872758c6b6e68900b3af794d5ffe79098f729a4
Author: Nick Dimiduk <nd...@gmail.com>
AuthorDate: Sat Mar 7 10:42:04 2020 -0800

    Revert "HBASE-18418 Remove apache_hbase_topology from dev-support"
    
    This reverts commit d641726da55082b5ab8b4528382ea1953476d615.
---
 dev-support/apache_hbase_topology/Dockerfile       |  24 ++
 dev-support/apache_hbase_topology/README.md        |  49 +++
 dev-support/apache_hbase_topology/__init__.py      |  15 +
 dev-support/apache_hbase_topology/actions.py       | 421 +++++++++++++++++++++
 .../apache_hbase_topology/configurations.cfg       |  80 ++++
 dev-support/apache_hbase_topology/profile.cfg      |  82 ++++
 dev-support/apache_hbase_topology/ssh/id_rsa       |  44 +++
 dev-support/apache_hbase_topology/ssh/id_rsa.pub   |  18 +
 8 files changed, 733 insertions(+)

diff --git a/dev-support/apache_hbase_topology/Dockerfile b/dev-support/apache_hbase_topology/Dockerfile
new file mode 100644
index 0000000..714a55c
--- /dev/null
+++ b/dev-support/apache_hbase_topology/Dockerfile
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+FROM debian:wheezy
+
+ENV TOPOLOGY_NAME=apache_hbase
+ADD . /root/clusterdock/clusterdock/topologies/${TOPOLOGY_NAME}
+
+RUN find /root -type f -name id_rsa -exec chmod 600 {} \;
+
+VOLUME /root/clusterdock/clusterdock/topologies/${TOPOLOGY_NAME}
+CMD ["/true"]
diff --git a/dev-support/apache_hbase_topology/README.md b/dev-support/apache_hbase_topology/README.md
new file mode 100644
index 0000000..018ee99
--- /dev/null
+++ b/dev-support/apache_hbase_topology/README.md
@@ -0,0 +1,49 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+# apache_hbase clusterdock topology
+
+## Overview
+*clusterdock* is a framework for creating Docker-based container clusters. Unlike regular Docker
+containers, which tend to run single processes and then exit once the process terminates, these
+container clusters are characterized by the execution of an init process in daemon mode. As such,
+the containers act more like "fat containers" or "light VMs": entities with accessible IP addresses
+which emulate standalone hosts.
+
+*clusterdock* relies upon the notion of a topology to define how clusters should be built into
+images and then what to do with those images to start Docker container clusters.
+
+## Usage
+The *clusterdock* framework is designed to be run out of its own container while affecting
+operations on the host. To avoid problems that might result from incorrectly
+formatting this framework invocation, a Bash helper script (`clusterdock.sh`) can be sourced on a
+host that has Docker installed. Afterwards, running any of the binaries intended to carry
+out *clusterdock* actions can be done using the `clusterdock_run` command.
+```
+wget https://raw.githubusercontent.com/cloudera/clusterdock/master/clusterdock.sh
+# ALWAYS INSPECT SCRIPTS FROM THE INTERNET BEFORE SOURCING THEM.
+source clusterdock.sh
+```
+
+Since the *clusterdock* framework itself lives outside of Apache HBase, an environment variable
+is used to let the helper script know where to find an image of the *apache_hbase* topology. To
+start a four-node Apache HBase cluster with default versions, you would simply run
+```
+CLUSTERDOCK_TOPOLOGY_IMAGE=apache_hbase_topology_location clusterdock_run \
+    ./bin/start_cluster apache_hbase --secondary-nodes='node-{2..4}'
+```
diff --git a/dev-support/apache_hbase_topology/__init__.py b/dev-support/apache_hbase_topology/__init__.py
new file mode 100644
index 0000000..635f0d9
--- /dev/null
+++ b/dev-support/apache_hbase_topology/__init__.py
@@ -0,0 +1,15 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/dev-support/apache_hbase_topology/actions.py b/dev-support/apache_hbase_topology/actions.py
new file mode 100644
index 0000000..26566e0
--- /dev/null
+++ b/dev-support/apache_hbase_topology/actions.py
@@ -0,0 +1,421 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""The actions module for the apache_hbase topology. The behavior to be carried out by the
+build_cluster and start_cluster clusterdock scripts is to be defined through the build and
+start functions, respectively.
+"""
+
+import logging
+import tarfile
+from ConfigParser import ConfigParser
+from os import EX_OK, listdir, makedirs, remove # pylint: disable=ungrouped-imports
+                                                # Follow convention of grouping from module imports
+                                                # after normal imports.
+from os.path import exists, join
+from shutil import move
+from socket import getfqdn
+from sys import stdout
+from uuid import uuid4
+
+# pylint: disable=import-error
+# clusterdock topologies get access to the clusterdock package at run time, but their reference
+# to clusterdock modules will confuse pylint, so we have to disable it.
+
+import requests
+from docker import Client
+
+from clusterdock import Constants
+from clusterdock.cluster import Cluster, Node, NodeGroup
+from clusterdock.docker_utils import (build_image, get_clusterdock_container_id,
+                                      get_host_port_binding, is_image_available_locally, pull_image)
+from clusterdock.utils import strip_components_from_tar, XmlConfiguration
+
+# We disable a couple of Pylint conventions because it assumes that module level variables must be
+# named as if they're constants (which isn't the case here).
+logger = logging.getLogger(__name__) # pylint: disable=invalid-name
+logger.setLevel(logging.INFO)
+
+client = Client() # pylint: disable=invalid-name
+
+DEFAULT_APACHE_NAMESPACE = Constants.DEFAULT.apache_namespace # pylint: disable=no-member
+
+def _copy_container_folder_to_host(container_id, source_folder, destination_folder,
+                                   host_folder=None):
+    if not exists(destination_folder):
+        makedirs(destination_folder)
+    stream, _ = client.get_archive(container_id, source_folder)
+    tar_filename = join(destination_folder, 'container_folder.tar')
+    with open(tar_filename, 'wb') as file_descriptor:
+        file_descriptor.write(stream.read())
+    tar = tarfile.open(name=tar_filename)
+    tar.extractall(path=destination_folder, members=strip_components_from_tar(tar))
+    tar.close()
+    remove(tar_filename)
+    logger.info("Extracted container folder %s to %s.", source_folder,
+                host_folder if host_folder else destination_folder)
+
+def _create_configs_from_file(filename, cluster_config_dir, wildcards):
+    configurations = ConfigParser(allow_no_value=True)
+    configurations.read(filename)
+
+    for config_file in configurations.sections():
+        logger.info("Updating %s...", config_file)
+        # For XML configuration files, run things through XmlConfiguration.
+        if config_file.endswith('.xml'):
+            XmlConfiguration(
+                {item[0]: item[1].format(**wildcards)
+                 for item in configurations.items(config_file)}
+            ).write_to_file(join(cluster_config_dir, config_file))
+        # For everything else, recognize whether a line in the configuration should simply be
+        # appended to the bottom of a file or processed in some way. The presence of +++ will
+        # lead to the evaluation of the following string through the end of the line.
+        else:
+            lines = []
+            for item in configurations.items(config_file):
+                if item[0].startswith('+++'):
+                    command = item[0].lstrip('+ ').format(**wildcards)
+
+                    # Yes, we use eval here. This is potentially dangerous, but intentional.
+                    lines.append(str(eval(command))) # pylint: disable=eval-used
+                elif item[0] == "body":
+                    lines.append(item[1].format(**wildcards))
+                else:
+                    lines.append(item[0].format(**wildcards))
+            with open(join(cluster_config_dir, config_file), 'w') as conf:
+                conf.write("".join(["{0}\n".format(line) for line in lines]))
+
+# Keep track of some common web UI ports that we'll expose to users later (e.g. to allow a user
+# to reach the HDFS NameNode web UI over the internet).
+HBASE_REST_SERVER_PORT = 8080
+NAMENODE_WEB_UI_PORT = 50070
+RESOURCEMANAGER_WEB_UI_PORT = 8088
+
+# When starting or building a cluster, CLUSTERDOCK_VOLUME will be the root directory for persistent
+# files (note that this location will itself be in a Docker container's filesystem).
+CLUSTERDOCK_VOLUME = '/tmp/clusterdock'
+
+def start(args):
+    """This function will be executed when ./bin/start_cluster apache_hbase is invoked."""
+
+    # pylint: disable=too-many-locals
+    # Pylint doesn't want more than 15 local variables in a function; this one has 17. This is about
+    # as low as I want to go because, while I can cheat and stuff unrelated things in a dictionary,
+    # that won't improve readability.
+
+    uuid = str(uuid4())
+    container_cluster_config_dir = join(CLUSTERDOCK_VOLUME, uuid, 'config')
+    makedirs(container_cluster_config_dir)
+
+    for mount in client.inspect_container(get_clusterdock_container_id())['Mounts']:
+        if mount['Destination'] == CLUSTERDOCK_VOLUME:
+            host_cluster_config_dir = join(mount['Source'], uuid, 'config')
+            break
+    else:
+        raise Exception("Could not find source of {0} mount.".format(CLUSTERDOCK_VOLUME))
+
+    # CLUSTERDOCK_VOLUME/uuid/config in the clusterdock container corresponds to
+    # host_cluster_config_dir on the Docker host.
+    logger.debug("Creating directory for cluster configuration files in %s...",
+                 host_cluster_config_dir)
+
+    # Generate the image name to use from the command line arguments passed in.
+    image = '/'.join(
+        [item
+         for item in [args.registry_url, args.namespace or DEFAULT_APACHE_NAMESPACE,
+                      "clusterdock:{os}_java-{java}_hadoop-{hadoop}_hbase-{hbase}".format(
+                          os=args.operating_system, java=args.java_version,
+                          hadoop=args.hadoop_version, hbase=args.hbase_version
+                      )]
+         if item]
+    )
+    if args.always_pull or not is_image_available_locally(image):
+        pull_image(image)
+
+    # Before starting the cluster, we create a throwaway container from which we copy
+    # configuration files back to the host. We also use this container to run an HBase
+    # command that returns the port of the HBase master web UI. Since we aren't running init here,
+# we also have to manually pass in JAVA_HOME as an environment variable.
+    get_hbase_web_ui_port_command = ('/hbase/bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool '
+                                     'hbase.master.info.port')
+    container_id = client.create_container(image=image, command=get_hbase_web_ui_port_command,
+                                           environment={'JAVA_HOME': '/java'})['Id']
+    logger.debug("Created temporary container (id: %s) from which to copy configuration files.",
+                 container_id)
+
+    # Actually do the copying of Hadoop configs...
+    _copy_container_folder_to_host(container_id, '/hadoop/etc/hadoop',
+                                   join(container_cluster_config_dir, 'hadoop'),
+                                   join(host_cluster_config_dir, 'hadoop'))
+
+    # ... and repeat for HBase configs.
+    _copy_container_folder_to_host(container_id, '/hbase/conf',
+                                   join(container_cluster_config_dir, 'hbase'),
+                                   join(host_cluster_config_dir, 'hbase'))
+
+    logger.info("The /hbase/lib folder on containers in the cluster will be volume mounted "
+                "into %s...", join(host_cluster_config_dir, 'hbase-lib'))
+    _copy_container_folder_to_host(container_id, '/hbase/lib',
+                                   join(container_cluster_config_dir, 'hbase-lib'),
+                                   join(host_cluster_config_dir, 'hbase-lib'))
+
+    # Every node in the cluster will have a shared volume mount from the host for Hadoop and HBase
+    # configuration files as well as the HBase lib folder.
+    shared_volumes = [{join(host_cluster_config_dir, 'hadoop'): '/hadoop/etc/hadoop'},
+                      {join(host_cluster_config_dir, 'hbase'): '/hbase/conf'},
+                      {join(host_cluster_config_dir, 'hbase-lib'): '/hbase/lib'}]
+
+    # Get the HBase master web UI port, stripping the newline the Docker REST API gives us.
+    client.start(container=container_id)
+    if client.wait(container=container_id) == EX_OK:
+        hbase_master_web_ui_port = client.logs(container=container_id).rstrip()
+        client.remove_container(container=container_id, force=True)
+    else:
+        raise Exception('Failed to remove HBase configuration container.')
+
+    # Create the Node objects. These hold the state of our container nodes and will be started
+    # at Cluster instantiation time.
+    primary_node = Node(hostname=args.primary_node[0], network=args.network,
+                        image=image, ports=[NAMENODE_WEB_UI_PORT,
+                                            hbase_master_web_ui_port,
+                                            RESOURCEMANAGER_WEB_UI_PORT,
+                                            HBASE_REST_SERVER_PORT],
+                        volumes=shared_volumes)
+    secondary_nodes = []
+    for hostname in args.secondary_nodes:
+        # A list of service directories will be used to name folders on the host and, with an
+        # index appended, in the container as well (e.g. /data1/node-1/dfs:/dfs1).
+        service_directories = ['dfs', 'yarn']
+
+        # Every Node will have shared_volumes to let one set of configs on the host be propagated
+        # to every container. If --data-directories is specified, this will be appended to allow
+        # containers to use multiple disks on the host.
+        volumes = shared_volumes[:]
+        if args.data_directories:
+            data_directories = args.data_directories.split(',')
+            volumes += [{join(data_directory, uuid, hostname, service_directory):
+                             "/{0}{1}".format(service_directory, i)}
+                        for i, data_directory in enumerate(data_directories, start=1)
+                        for service_directory in service_directories]
+        secondary_nodes.append(Node(hostname=hostname,
+                                    network=args.network,
+                                    image=image,
+                                    volumes=volumes))
+
+    Cluster(topology='apache_hbase',
+            node_groups=[NodeGroup(name='primary', nodes=[primary_node]),
+                         NodeGroup(name='secondary', nodes=secondary_nodes)],
+            network_name=args.network).start()
+
+    # When creating configs, pass a dictionary of wildcards into _create_configs_from_file
+    # to transform placeholders in the configurations.cfg file into real values.
+    _create_configs_from_file(filename=args.configurations,
+                              cluster_config_dir=container_cluster_config_dir,
+                              wildcards={"primary_node": args.primary_node,
+                                         "secondary_nodes": args.secondary_nodes,
+                                         "all_nodes": args.primary_node + args.secondary_nodes,
+                                         "network": args.network})
+
+    # After creating configurations from the configurations.cfg file, update hdfs-site.xml and
+    # yarn-site.xml to use the data directories passed on the command line.
+    if args.data_directories:
+        _update_config_for_data_dirs(
+            container_cluster_config_dir=container_cluster_config_dir,
+            data_directories=data_directories
+        )
+
+    if not args.dont_start_services:
+        _start_services(primary_node, hbase_master_web_ui_port=hbase_master_web_ui_port)
+
+def _update_config_for_data_dirs(container_cluster_config_dir, data_directories):
+    logger.info('Updating dfs.datanode.data.dir in hdfs-site.xml...')
+    hdfs_site_xml_filename = join(container_cluster_config_dir, 'hadoop', 'hdfs-site.xml')
+    hdfs_site_xml = XmlConfiguration(
+        properties={'dfs.datanode.data.dir':
+                    ','.join(["/dfs{0}".format(i)
+                              for i, _ in enumerate(data_directories, start=1)])},
+        source_file=hdfs_site_xml_filename
+    )
+    hdfs_site_xml.write_to_file(filename=hdfs_site_xml_filename)
+
+    logger.info('Updating yarn.nodemanager.local-dirs in yarn-site.xml...')
+    yarn_site_xml_filename = join(container_cluster_config_dir, 'hadoop', 'yarn-site.xml')
+    yarn_site_xml = XmlConfiguration(
+        properties={'yarn.nodemanager.local-dirs':
+                    ','.join(["/yarn{0}".format(i)
+                              for i, _ in enumerate(data_directories, start=1)])},
+        source_file=yarn_site_xml_filename
+    )
+    yarn_site_xml.write_to_file(filename=yarn_site_xml_filename)
+
+def _start_services(primary_node, **kwargs):
+    logger.info("Formatting namenode on %s...", primary_node.fqdn)
+    primary_node.ssh('hdfs namenode -format')
+
+    logger.info("Starting HDFS...")
+    primary_node.ssh('/hadoop/sbin/start-dfs.sh')
+
+    logger.info("Starting YARN...")
+    primary_node.ssh('/hadoop/sbin/start-yarn.sh')
+
+    logger.info('Starting HBase...')
+    primary_node.ssh('/hbase/bin/start-hbase.sh')
+    primary_node.ssh('/hbase/bin/hbase-daemon.sh start rest')
+
+    logger.info("NameNode and HBase master are located on %s. SSH over and have fun!",
+                primary_node.hostname)
+
+    logger.info("The HDFS NameNode web UI can be reached at http://%s:%s",
+                getfqdn(), get_host_port_binding(primary_node.container_id,
+                                                 NAMENODE_WEB_UI_PORT))
+
+    logger.info("The YARN ResourceManager web UI can be reached at http://%s:%s",
+                getfqdn(), get_host_port_binding(primary_node.container_id,
+                                                 RESOURCEMANAGER_WEB_UI_PORT))
+
+    logger.info("The HBase master web UI can be reached at http://%s:%s",
+                getfqdn(), get_host_port_binding(primary_node.container_id,
+                                                 kwargs.get('hbase_master_web_ui_port')))
+
+    logger.info("The HBase REST server can be reached at http://%s:%s",
+                getfqdn(), get_host_port_binding(primary_node.container_id,
+                                                 HBASE_REST_SERVER_PORT))
+
+def build(args):
+    """This function will be executed when ./bin/build_cluster apache_hbase is invoked."""
+
+    # pylint: disable=too-many-locals
+    # See start function above for rationale for disabling this warning.
+
+    container_build_dir = join(CLUSTERDOCK_VOLUME, str(uuid4()))
+    makedirs(container_build_dir)
+
+    # If --hbase-git-commit is specified, we build HBase from source.
+    if args.hbase_git_commit:
+        build_hbase_commands = [
+            "git clone https://github.com/apache/hbase.git {0}".format(container_build_dir),
+            "git -C {0} checkout {1}".format(container_build_dir, args.hbase_git_commit),
+            "mvn --batch-mode clean install -DskipTests assembly:single -f {0}/pom.xml".format(
+                container_build_dir
+            )
+        ]
+
+        maven_image = Constants.docker_images.maven # pylint: disable=no-member
+        if not is_image_available_locally(maven_image):
+            pull_image(maven_image)
+
+        container_configs = {
+            'command': 'bash -c "{0}"'.format(' && '.join(build_hbase_commands)),
+            'image': maven_image,
+            'host_config': client.create_host_config(volumes_from=get_clusterdock_container_id())
+        }
+
+        maven_container_id = client.create_container(**container_configs)['Id']
+        client.start(container=maven_container_id)
+        for line in client.logs(container=maven_container_id, stream=True):
+            stdout.write(line)
+            stdout.flush()
+
+        # Mimic docker run --rm by blocking on docker wait and then removing the container
+        # if it encountered no errors.
+        if client.wait(container=maven_container_id) == EX_OK:
+            client.remove_container(container=maven_container_id, force=True)
+        else:
+            raise Exception('Error encountered while building HBase.')
+
+        assembly_target_dir = join(container_build_dir, 'hbase-assembly', 'target')
+        for a_file in listdir(assembly_target_dir):
+            if a_file.endswith('bin.tar.gz'):
+                args.hbase_tarball = join(assembly_target_dir, a_file)
+                break
+
+    # Download all the binary tarballs into our temporary directory so that we can add them
+    # into the Docker image we're building.
+    filenames = []
+    for tarball_location in [args.java_tarball, args.hadoop_tarball, args.hbase_tarball]:
+        tarball_filename = tarball_location.rsplit('/', 1)[-1]
+        filenames.append(tarball_filename)
+
+        # Download tarballs given as URLs.
+        if container_build_dir not in tarball_location:
+            get_request = requests.get(tarball_location, stream=True, cookies=(
+                {'oraclelicense': 'accept-securebackup-cookie'}
+                if tarball_location == args.java_tarball
+                else None
+            ))
+            # Raise Exception if download failed.
+            get_request.raise_for_status()
+            logger.info("Downloading %s...", tarball_filename)
+            with open(join(container_build_dir, tarball_filename), 'wb') as file_descriptor:
+                for chunk in get_request.iter_content(1024):
+                    file_descriptor.write(chunk)
+        else:
+            move(tarball_location, container_build_dir)
+
+    dockerfile_contents = r"""
+    FROM {nodebase_image}
+    COPY {java_tarball} /tarballs/
+    RUN mkdir /java && tar -xf /tarballs/{java_tarball} -C /java --strip-components=1
+    RUN echo "JAVA_HOME=/java" >> /etc/environment
+
+    COPY {hadoop_tarball} /tarballs/
+    RUN mkdir /hadoop && tar -xf /tarballs/{hadoop_tarball} -C /hadoop --strip-components=1
+    COPY {hbase_tarball} /tarballs/
+    RUN mkdir /hbase && tar -xf /tarballs/{hbase_tarball} -C /hbase --strip-components=1
+
+    # Remove tarballs folder.
+    RUN rm -rf /tarballs
+
+    # Set PATH explicitly.
+    RUN echo "PATH=/java/bin:/hadoop/bin:/hbase/bin/:$(echo $PATH)" >> /etc/environment
+
+    # Add hbase user and group before copying root's SSH keys over.
+    RUN groupadd hbase \
+        && useradd -g hbase hbase \
+        && cp -R /root/.ssh ~hbase \
+        && chown -R hbase:hbase ~hbase/.ssh
+
+    # Disable requiretty in /etc/sudoers as required by HBase chaos monkey.
+    RUN sed -i 's/Defaults\s*requiretty/#&/' /etc/sudoers
+    """.format(nodebase_image='/'.join([item
+                                        for item in [args.registry_url,
+                                                     args.namespace or DEFAULT_APACHE_NAMESPACE,
+                                                     "clusterdock:{os}_nodebase".format(
+                                                         os=args.operating_system
+                                                     )]
+                                        if item]),
+               java_tarball=filenames[0], hadoop_tarball=filenames[1], hbase_tarball=filenames[2])
+
+    logger.info("Created Dockerfile: %s", dockerfile_contents)
+
+    with open(join(container_build_dir, 'Dockerfile'), 'w') as dockerfile:
+        dockerfile.write(dockerfile_contents)
+
+    image = '/'.join(
+        [item
+         for item in [args.registry_url, args.namespace or DEFAULT_APACHE_NAMESPACE,
+                      "clusterdock:{os}_java-{java}_hadoop-{hadoop}_hbase-{hbase}".format(
+                          os=args.operating_system, java=args.java_version,
+                          hadoop=args.hadoop_version, hbase=args.hbase_version
+                      )]
+         if item])
+
+    logger.info("Building image %s...", image)
+    build_image(dockerfile=join(container_build_dir, 'Dockerfile'), tag=image)
+
+    logger.info("Removing build temporary directory...")
+    return [image]
diff --git a/dev-support/apache_hbase_topology/configurations.cfg b/dev-support/apache_hbase_topology/configurations.cfg
new file mode 100644
index 0000000..f28995e
--- /dev/null
+++ b/dev-support/apache_hbase_topology/configurations.cfg
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# This configuration file is used by the apache-hbase clusterdock topology to populate configuration
+# files for an HBase cluster. Section names denote the filenames (e.g. [hadoop/core-site.xml]) with
+# the section's corresponding items denoting properties.
+#
+# Filenames ending with .xml will have items run through an XML parser. That is,
+#
+# [hbase/hbase-site.xml]
+# hbase.cluster.distributed = true
+#
+# would lead to the creation of an hbase-site.xml file containing:
+#
+# <property>
+#   <name>hbase.cluster.distributed</name>
+#   <value>true</value>
+# </property>
+#
+# Note, also, that items in non-xml files can be copied verbatim by using the "body:" item (leading
+# whitespace must be used for the following lines). For example:
+#
+# [hbase/hbase-env.sh]
+# body:
+#     COMMON_HBASE_OPTS="$COMMON_HBASE_OPTS -XX:+UseG1GC"
+#     COMMON_HBASE_OPTS="$COMMON_HBASE_OPTS -XX:+PrintGCDetails"
+#
+# would result in an hbase-env.sh file with:
+#
+#     COMMON_HBASE_OPTS="$COMMON_HBASE_OPTS -XX:+UseG1GC"
+#     COMMON_HBASE_OPTS="$COMMON_HBASE_OPTS -XX:+PrintGCDetails"
+#
+# Two last notes:
+# 1. Items starting with +++ will be eval'd with Python directly.
+# 2. As defined in actions.py, some wildcards are processed at cluster start time (e.g. {network}).
+
+[hadoop/slaves]
++++ '\n'.join(["{{0}}.{network}".format(node) for node in {secondary_nodes}])
+
+[hadoop/core-site.xml]
+fs.default.name = hdfs://{primary_node[0]}.{network}:8020
+
+[hadoop/mapred-site.xml]
+mapreduce.framework.name = yarn
+
+[hadoop/yarn-site.xml]
+yarn.resourcemanager.hostname = {primary_node[0]}.{network}
+yarn.nodemanager.aux-services = mapreduce_shuffle
+yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
+yarn.nodemanager.vmem-check-enabled = false
+
+[hbase/regionservers]
++++ '\n'.join(["{{0}}.{network}".format(node) for node in {secondary_nodes}])
+
+[hbase/backup-masters]
+{secondary_nodes[0]}.{network}
+
+[hbase/hbase-site.xml]
+hbase.cluster.distributed = true
+hbase.rootdir = hdfs://{primary_node[0]}.{network}/hbase
+hbase.zookeeper.quorum = {primary_node[0]}.{network}
+hbase.zookeeper.property.dataDir = /usr/local/zookeeper
+
+# For now, set service users for Chaos Monkey to be root.
+hbase.it.clustermanager.hadoop.hdfs.user = root
+hbase.it.clustermanager.zookeeper.user = root
+hbase.it.clustermanager.hbase.user = root
diff --git a/dev-support/apache_hbase_topology/profile.cfg b/dev-support/apache_hbase_topology/profile.cfg
new file mode 100644
index 0000000..e6937ff
--- /dev/null
+++ b/dev-support/apache_hbase_topology/profile.cfg
@@ -0,0 +1,82 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# This configuration file is used to define some general properties of the apache-hbase clusterdock
+# topology, as well as its command line.
+
+[general]
+name = Apache HBase
+description = An Apache HBase cluster with 1 primary node and n-1 secondary nodes
+
+
+[node_groups]
+# Define node groups and specify which of each to start during the build process.
+primary-node = node-1
+secondary-nodes = node-2
+
+
+[build]
+arg.java-tarball = http://download.oracle.com/otn-pub/java/jdk/8u91-b14/jdk-8u91-linux-x64.tar.gz
+arg.java-tarball.help = The URL (or filename) of the Java tarball to install on the cluster
+arg.java-tarball.metavar = tarball
+
+arg.java-version = 8u91
+arg.java-version.help = The label to use when identifying the version of Java
+arg.java-version.metavar = ver
+
+arg.hadoop-tarball = https://archive.apache.org/dist/hadoop/core/hadoop-2.7.2/hadoop-2.7.2.tar.gz
+arg.hadoop-tarball.help = The URL (or filename) of the Hadoop tarball to install on the cluster
+arg.hadoop-tarball.metavar = tarball
+
+arg.hadoop-version = 2.7.2
+arg.hadoop-version.help = The label to use when identifying the version of Hadoop
+arg.hadoop-version.metavar = ver
+
+arg.hbase-tarball
+arg.hbase-tarball.help = The URL (or filename) of the HBase tarball to install on the cluster
+arg.hbase-tarball.metavar = tarball
+
+arg.hbase-version
+arg.hbase-version.help = The label to use when identifying the version of HBase
+arg.hbase-version.metavar = ver
+
+arg.hbase-git-commit
+arg.hbase-git-commit.help = The git commit to checkout when building the image
+arg.hbase-git-commit.metavar = commit
+
+[start]
+arg.java-version = 8u91
+arg.java-version.help = The Java version on the cluster
+arg.java-version.metavar = ver
+
+arg.hadoop-version = 2.7.2
+arg.hadoop-version.help = The Hadoop version on the cluster
+arg.hadoop-version.metavar = ver
+
+arg.hbase-version = master
+arg.hbase-version.help = The HBase version on the cluster
+arg.hbase-version.metavar = ver
+
+arg.configurations = /root/clusterdock/clusterdock/topologies/apache_hbase/configurations.cfg
+arg.configurations.help = Location of a configurations.cfg file to apply to the cluster
+arg.configurations.metavar = path
+
+arg.data-directories
+arg.data-directories.help = A comma-separated list of host directories on which to save data (i.e. for HDFS and YARN)
+arg.data-directories.metavar = dir
+
+arg.dont-start-services = False
+arg.dont-start-services.help = Don't start Hadoop and HBase services as part of cluster start
diff --git a/dev-support/apache_hbase_topology/ssh/id_rsa b/dev-support/apache_hbase_topology/ssh/id_rsa
new file mode 100644
index 0000000..7c9d31e
--- /dev/null
+++ b/dev-support/apache_hbase_topology/ssh/id_rsa
@@ -0,0 +1,44 @@
+#/**
+# * Licensed to the Apache Software Foundation (ASF) under one
+# * or more contributor license agreements.  See the NOTICE file
+# * distributed with this work for additional information
+# * regarding copyright ownership.  The ASF licenses this file
+# * to you under the Apache License, Version 2.0 (the
+# * "License"); you may not use this file except in compliance
+# * with the License.  You may obtain a copy of the License at
+# *
+# *     http://www.apache.org/licenses/LICENSE-2.0
+# *
+# * Unless required by applicable law or agreed to in writing, software
+# * distributed under the License is distributed on an "AS IS" BASIS,
+# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# * See the License for the specific language governing permissions and
+# * limitations under the License.
+# */
+-----BEGIN RSA PRIVATE KEY-----
+MIIEpAIBAAKCAQEAtj/yZZNF+bv26eqWqsx+vFehSlxBJp/QhIrFWKpjHcpQJ29o
+6hJN9moU3Goft2C5w6FoZdjC1TWlzhxUnRS8xeksFnW3ItvkjySLA1Iq6jchYxNd
+fZ3HwTdH0rubM1uJ/CnkaijoxBqGBmPSL0TxfqcteaJ8APhslwl0WWJ6b+tBCHDV
+pTLATebtggCAfhKmSuAYmn3QIqJ7DFoSGkwhkxpUHuuVCZxUH3CIxLIw+6npArr/
+S5gtFo50oi6FXPmvv6mJg6yLqq3VlKcQh6d/COaJopHn+nLed2kECESUlpTruMpr
+6IcGESgz4hnkmhop8oTY42sQJhPPF2Ahq9a3aQIDAQABAoIBADSbxPb5SkvKrH3d
+j9yB51uq2A5FDzF9FI4OGOV9WdsxmW2oxVo8KnElMhxmLf2bWERWhXJQ3fz53YDf
+wLUPVWaz5lwdYt4XJ6UCYXZ185lkjKiy4FvwfccSlBMKwMRUekJmPV8/q+Ff3qxd
+iEDI4AU1cPUZqD4HeCEpQ4LB4KIJhCdLkCgoWxxaCwwuB6DnwB4P4VLeAfsm2NEX
+k9dld87q/miOmuw9QsmSv9wYiQqoPdV5Qj7KYqHBAa6syqUfFni3Ibmw1WBzMydp
+8YyP9HvrzDBMnPPzkmp6od0fAgGafIlkIxz/9sCKOSISnuuqahbNAJK/rIiJzLY3
+Pi49M+ECgYEA2vCFObmM/hpKUPNxG841nQScFfeMg1q1z1krEmfjqDTELXyq9AOS
+fGiVTxfWagOGoAWIwg3ZfgGEmxsKrOSxkFvabWdhN1Wn98Zf8fJG8YAwLYg8JOgf
+gZ5pkxFW4FwrAeFDyJyKvFJVrbDw1PM41yvTmRzf3NjcaqJrBE2fgKUCgYEA1RmF
+XjfMlBoMLZ4pXS1zF91WgOF4SNdJJj9RCGzqqdy+DzPKNAZVa0HBhaIHCZXL3Hcv
+zqgEb6RSMysyVYjZPwfSwevsSuxpfriVpYux5MN3AEXX5Ysv51WWStWgt9+iQlfo
+xAdxxukOa++PZ4Z+TIIEDAFS47rnKEQUh+ZNfHUCgYEA0amTa3wtcQlsMalvn9kR
+rpRDhSXTAddUVIRnovCqKuKdG5JPg+4H0eu1UFDbnBpUSdoC5RKuPOTnQEHdL0Sy
+ZjQQMMTXbE4y1Cy8pM4G8i539KKKNi20PkSdhaENOT4KUXqPlwWSNlYChprzhnqE
+7EmkEPR9zNg//D4djbloDaECgYANOJIfsFKO9ba/tcpXL5SubFsLj/GIg2LUbqU2
+YpuEgl+ATfRDmgj+qIu7ILxTCeol+XcL2Ty9OHKpHgr3Z5Ai6vdWdK6qT1SUOht+
+s9YLnVzqtWqZoTMNpS+34N0hy0wj1ZRpZRTYBGmSpMA+6gc38/EQVZyw6E2jH+Yu
+MEmqaQKBgQDGh9uXCl/WjhBOF9VLrX/Aeaa2Mzrh21Ic1dw6aWrE4EW6k1LvSP36
+evrvrs2jQuzRMGH6DKX8ImnVEWjK+gZfgf2MuyDSW7KYR5zxkdZtRkotF6X0fu6N
+8uLa7CN8UmS4FiAMLwNbTJ6zA6ohny7r+AiOqNGlP9vBFMhpGs3NFg==
+-----END RSA PRIVATE KEY-----
diff --git a/dev-support/apache_hbase_topology/ssh/id_rsa.pub b/dev-support/apache_hbase_topology/ssh/id_rsa.pub
new file mode 100644
index 0000000..0dca44a
--- /dev/null
+++ b/dev-support/apache_hbase_topology/ssh/id_rsa.pub
@@ -0,0 +1,18 @@
+#/**
+# * Licensed to the Apache Software Foundation (ASF) under one
+# * or more contributor license agreements.  See the NOTICE file
+# * distributed with this work for additional information
+# * regarding copyright ownership.  The ASF licenses this file
+# * to you under the Apache License, Version 2.0 (the
+# * "License"); you may not use this file except in compliance
+# * with the License.  You may obtain a copy of the License at
+# *
+# *     http://www.apache.org/licenses/LICENSE-2.0
+# *
+# * Unless required by applicable law or agreed to in writing, software
+# * distributed under the License is distributed on an "AS IS" BASIS,
+# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# * See the License for the specific language governing permissions and
+# * limitations under the License.
+# */
+ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC2P/Jlk0X5u/bp6paqzH68V6FKXEEmn9CEisVYqmMdylAnb2jqEk32ahTcah+3YLnDoWhl2MLVNaXOHFSdFLzF6SwWdbci2+SPJIsDUirqNyFjE119ncfBN0fSu5szW4n8KeRqKOjEGoYGY9IvRPF+py15onwA+GyXCXRZYnpv60EIcNWlMsBN5u2CAIB+EqZK4BiafdAionsMWhIaTCGTGlQe65UJnFQfcIjEsjD7qekCuv9LmC0WjnSiLoVc+a+/qYmDrIuqrdWUpxCHp38I5omikef6ct53aQQIRJSWlOu4ymvohwYRKDPiGeSaGinyhNjjaxAmE88XYCGr1rdp clusterdock