Posted to commits@knox.apache.org by lm...@apache.org on 2016/08/03 19:44:13 UTC

svn commit: r1755109 [4/5] - /knox/site/books/knox-0-9-1/

Added: knox/site/books/knox-0-9-1/user-guide.html
URL: http://svn.apache.org/viewvc/knox/site/books/knox-0-9-1/user-guide.html?rev=1755109&view=auto
==============================================================================
--- knox/site/books/knox-0-9-1/user-guide.html (added)
+++ knox/site/books/knox-0-9-1/user-guide.html Wed Aug  3 19:44:13 2016
@@ -0,0 +1,5009 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--><p><link href="book.css" rel="stylesheet"/></p><p><img src="knox-logo.gif" alt="Knox"/> <!-- <img src="apache-logo.gif" alt="Apache"/> --> <img src="apache-logo.gif" align="right" alt="Apache"/></p><h1><a id="Apache+Knox+Gateway+0.9.x+User's+Guide">Apache Knox Gateway 0.9.x User&rsquo;s Guide</a> <a href="#Apache+Knox+Gateway+0.9.x+User's+Guide"><img src="markbook-section-link.png"/></a></h1><h2><a id="Table+Of+Contents">Table Of Contents</a> <a href="#Table+Of+Contents"><img src="markbook-section-link.png"/></a></h2>
+<ul>
+  <li><a href="#Introduction">Introduction</a></li>
+  <li><a href="#Quick+Start">Quick Start</a></li>
+  <li><a href="#Gateway+Samples">Gateway Samples</a></li>
+  <li><a href="#Apache+Knox+Details">Apache Knox Details</a>
+  <ul>
+    <li><a href="#Apache+Knox+Directory+Layout">Apache Knox Directory Layout</a></li>
+    <li><a href="#Supported+Services">Supported Services</a></li>
+  </ul></li>
+  <li><a href="#Gateway+Details">Gateway Details</a>
+  <ul>
+    <li><a href="#URL+Mapping">URL Mapping</a></li>
+    <li><a href="#Configuration">Configuration</a></li>
+    <li><a href="#Knox+CLI">Knox CLI</a></li>
+    <li><a href="#Admin+API">Admin API</a></li>
+    <li><a href="#X-Forwarded-*+Headers+Support">X-Forwarded-* Headers Support</a></li>
+    <li><a href="#Authentication">Authentication</a></li>
+    <li><a href="#Advanced+LDAP+Authentication">Advanced LDAP Authentication</a></li>
+    <li><a href="#LDAP+Authentication+Caching">LDAP Authentication Caching</a></li>
+    <li><a href="#LDAP+Group+Lookup">LDAP Group Lookup</a></li>
+    <li><a href="#Identity+Assertion">Identity Assertion</a></li>
+    <li><a href="#Authorization">Authorization</a></li>
+    <li><a href="#Secure+Clusters">Secure Clusters</a></li>
+    <li><a href="#High+Availability">High Availability</a></li>
+    <li><a href="#Web+App+Security+Provider">Web App Security Provider</a>
+    <ul>
+      <li><a href="#CSRF">CSRF</a></li>
+      <li><a href="#CORS">CORS</a></li>
+      <li><a href="#X-Frame-Options">X-Frame-Options</a></li>
+    </ul></li>
+    <li><a href="#Preauthenticated+SSO+Provider">Preauthenticated SSO Provider</a></li>
+    <li><a href="#Pac4j+Provider+-+CAS+/+OAuth+/+SAML+/+OpenID+Connect">Pac4j Provider - CAS / OAuth / SAML / OpenID Connect</a></li>
+    <li><a href="#KnoxSSO+Setup+and+Configuration">KnoxSSO Setup and Configuration</a></li>
+    <li><a href="#Mutual+Authentication+with+SSL">Mutual Authentication with SSL</a></li>
+    <li><a href="#Audit">Audit</a></li>
+  </ul></li>
+  <li><a href="#Client+Details">Client Details</a></li>
+  <li><a href="#Service+Details">Service Details</a>
+  <ul>
+    <li><a href="#WebHDFS">WebHDFS</a></li>
+    <li><a href="#WebHCat">WebHCat</a></li>
+    <li><a href="#Oozie">Oozie</a></li>
+    <li><a href="#HBase">HBase</a></li>
+    <li><a href="#Hive">Hive</a></li>
+    <li><a href="#Yarn">Yarn</a></li>
+    <li><a href="#Storm">Storm</a></li>
+    <li><a href="#Common+Service+Config">Common Service Config</a></li>
+    <li><a href="#Default+Service+HA+support">Default Service HA support</a></li>
+  </ul></li>
+  <li><a href="#UI+Service+Details">UI Service Details</a></li>
+  <li><a href="#Limitations">Limitations</a></li>
+  <li><a href="#Troubleshooting">Troubleshooting</a></li>
+  <li><a href="#Export+Controls">Export Controls</a></li>
+</ul><h2><a id="Introduction">Introduction</a> <a href="#Introduction"><img src="markbook-section-link.png"/></a></h2><p>The Apache Knox Gateway is a system that provides a single point of authentication and access for Apache Hadoop services in a cluster. The goal is to simplify Hadoop security for both users (i.e. those who access the cluster data and execute jobs) and operators (i.e. those who control access and manage the cluster). The gateway runs as a server (or cluster of servers) that provides centralized access to one or more Hadoop clusters. In general the goals of the gateway are as follows:</p>
+<ul>
+  <li>Provide perimeter security for Hadoop REST APIs to make Hadoop security easier to set up and use
+  <ul>
+    <li>Provide authentication and token verification at the perimeter</li>
+    <li>Enable authentication integration with enterprise and cloud identity management systems</li>
+    <li>Provide service level authorization at the perimeter</li>
+  </ul></li>
+  <li>Expose a single URL hierarchy that aggregates REST APIs of a Hadoop cluster
+  <ul>
+    <li>Limit the network endpoints (and therefore firewall holes) required to access a Hadoop cluster</li>
+    <li>Hide the internal Hadoop cluster topology from potential attackers</li>
+  </ul></li>
+</ul><h2><a id="Quick+Start">Quick Start</a> <a href="#Quick+Start"><img src="markbook-section-link.png"/></a></h2><p>Here are the steps to have Apache Knox up and running against a Hadoop Cluster:</p>
+<ol>
+  <li>Verify system requirements</li>
+  <li>Download a virtual machine (VM) with Hadoop</li>
+  <li>Download Apache Knox Gateway</li>
+  <li>Start the virtual machine with Hadoop</li>
+  <li>Install Knox</li>
+  <li>Start the LDAP embedded within Knox</li>
+  <li>Start the Knox Gateway</li>
+  <li>Do Hadoop with Knox</li>
+</ol><h3><a id="1+-+Requirements">1 - Requirements</a> <a href="#1+-+Requirements"><img src="markbook-section-link.png"/></a></h3><h4><a id="Java">Java</a> <a href="#Java"><img src="markbook-section-link.png"/></a></h4><p>Java 1.6 or later is required for the Knox Gateway runtime. Use the command below to check the version of Java installed on the system where Knox will be running.</p>
+<pre><code>java -version
+</code></pre><h4><a id="Hadoop">Hadoop</a> <a href="#Hadoop"><img src="markbook-section-link.png"/></a></h4><p>Knox 0.9.1 supports Hadoop 2.x. The quick start instructions assume a Hadoop 2.x virtual machine based environment.</p><h3><a id="2+-+Download+Hadoop+2.x+VM">2 - Download Hadoop 2.x VM</a> <a href="#2+-+Download+Hadoop+2.x+VM"><img src="markbook-section-link.png"/></a></h3><p>The quick start provides a link to download the Hadoop 2.0 based Hortonworks virtual machine <a href="http://hortonworks.com/products/hdp-2/#install">Sandbox</a>. Please note that Knox supports other Hadoop distributions and is configurable against a full-blown Hadoop cluster. Configuring Knox for a given Hadoop 2.x version, for Hadoop deployed in EC2, or for a custom Hadoop cluster is documented in the advanced deployment guide.</p><h3><a id="3+-+Download+Apache+Knox+Gateway">3 - Download Apache Knox Gateway</a> <a href="#3+-+Download+Apache+Knox+Gateway"><img src="markbook-section-link.png"/></a></h3><p>Download one of the distributions below from the <a href="http://www.apache.org/dyn/closer.cgi/knox">Apache mirrors</a>.</p>
+<ul>
+  <li>Source archive: <a href="http://www.apache.org/dyn/closer.cgi/knox/0.9.1/knox-0.9.1-src.zip">knox-0.9.1-src.zip</a> (<a href="http://www.apache.org/dist/knox/0.9.1/knox-0.9.1-src.zip.asc">PGP signature</a>, <a href="http://www.apache.org/dist/knox/0.9.1/knox-0.9.1-src.zip.sha">SHA1 digest</a>, <a href="http://www.apache.org/dist/knox/0.9.1/knox-0.9.1-src.zip.md5">MD5 digest</a>)</li>
+  <li>Binary archive: <a href="http://www.apache.org/dyn/closer.cgi/knox/0.9.1/knox-0.9.1.zip">knox-0.9.1.zip</a> (<a href="http://www.apache.org/dist/knox/0.9.1/knox-0.9.1.zip.asc">PGP signature</a>, <a href="http://www.apache.org/dist/knox/0.9.1/knox-0.9.1.zip.sha">SHA1 digest</a>, <a href="http://www.apache.org/dist/knox/0.9.1/knox-0.9.1.zip.md5">MD5 digest</a>)</li>
+</ul><p>Apache Knox Gateway releases are available under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. See the NOTICE file contained in each release artifact for applicable copyright attribution notices.</p><h3><a id="Verify">Verify</a> <a href="#Verify"><img src="markbook-section-link.png"/></a></h3><p>Verification is optional but recommended. You can verify the integrity of any downloaded files using the PGP signatures. Please read <a href="http://httpd.apache.org/dev/verification.html">Verifying Apache HTTP Server Releases</a> for more information on why you should verify our releases.</p><p>The PGP signatures can be verified using PGP or GPG. First download the <a href="https://dist.apache.org/repos/dist/release/knox/KEYS">KEYS</a> file as well as the .asc signature files for the relevant release packages. Make sure you get these files from the main distribution directory linked above, rather than from a mirror. Then verify the signatures using one of the methods below.</p>
+<pre><code>% pgpk -a KEYS
+% pgpv knox-0.9.1.zip.asc
+</code></pre><p>or</p>
+<pre><code>% pgp -ka KEYS
+% pgp knox-0.9.1.zip.asc
+</code></pre><p>or</p>
+<pre><code>% gpg --import KEYS
+% gpg --verify knox-0.9.1.zip.asc
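+
+# Optional extra check (a sketch, not part of the original steps): compare the
+# SHA1 digest. Assumes the sha1sum tool (GNU coreutils) is available and that
+# knox-0.9.1.zip.sha was downloaded alongside the archive; the digest printed
+# by the first command should match the content of the .sha file.
+% sha1sum knox-0.9.1.zip
+% cat knox-0.9.1.zip.sha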
+</code></pre><h3><a id="4+-+Start+Hadoop+virtual+machine">4 - Start Hadoop virtual machine</a> <a href="#4+-+Start+Hadoop+virtual+machine"><img src="markbook-section-link.png"/></a></h3><p>Start the Hadoop virtual machine.</p><h3><a id="5+-+Install+Knox">5 - Install Knox</a> <a href="#5+-+Install+Knox"><img src="markbook-section-link.png"/></a></h3><p>The steps required to install the gateway will vary depending upon which distribution format (zip | rpm) was downloaded. In either case you will end up with a directory where the gateway is installed. This directory will be referred to as your <code>{GATEWAY_HOME}</code> throughout this document.</p><h4><a id="ZIP">ZIP</a> <a href="#ZIP"><img src="markbook-section-link.png"/></a></h4><p>If you downloaded the Zip distribution you can simply extract the contents into a directory. The example below provides a command that can be executed to do this. Note the <code>{VERSION}</code> portion of the command must be replaced with an actual Apache Knox Gateway version number. This might be 0.9.1 for example.</p>
+<pre><code>unzip knox-{VERSION}.zip
+</code></pre><p>This will create a directory <code>knox-{VERSION}</code> in your current directory. The directory <code>knox-{VERSION}</code> will be considered your <code>{GATEWAY_HOME}</code>.</p><h3><a id="6+-+Start+LDAP+embedded+in+Knox">6 - Start LDAP embedded in Knox</a> <a href="#6+-+Start+LDAP+embedded+in+Knox"><img src="markbook-section-link.png"/></a></h3><p>Knox comes with an LDAP server for demonstration purposes. Note: If the tool used to extract the contents of the archive was not capable of making the files in the bin directory executable, you may need to make them executable yourself before running the commands below.</p>
+<pre><code>cd {GATEWAY_HOME}
+bin/ldap.sh start
+</code></pre><h3><a id="7+-+Create+the+Master+Secret">7 - Create the Master Secret</a> <a href="#7+-+Create+the+Master+Secret"><img src="markbook-section-link.png"/></a></h3><p>Run the knoxcli create-master command in order to persist the master secret that is used to protect the key and credential stores for the gateway instance.</p>
+<pre><code>cd {GATEWAY_HOME}
+bin/knoxcli.sh create-master
+</code></pre><p>The CLI will prompt you for the master secret (i.e. password).</p><h3><a id="7+-+Start+Knox">7 - Start Knox</a> <a href="#7+-+Start+Knox"><img src="markbook-section-link.png"/></a></h3><p>The gateway can be started using the provided shell script.</p><p>The server will discover the persisted master secret during start up and complete the setup process for demo installs. A demo install will consist of a Knox gateway instance with an identity certificate for localhost. This will require clients to be on the same machine or to turn off hostname verification. For more involved deployments, see the Knox CLI section of this document for additional configuration options, including the ability to create a self-signed certificate for a specific hostname.</p>
+<pre><code>cd {GATEWAY_HOME}
+bin/gateway.sh start
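+
+# Optional check (a sketch, not part of the original steps): list the pids and
+# logs directories to confirm that the background process left its files there.
+ls pids/ logs/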
+</code></pre><p>When starting the gateway this way the process will be run in the background. The log files will be written to {GATEWAY_HOME}/logs and the process ID files (PIDs) will be written to {GATEWAY_HOME}/pids.</p><p>In order to stop a gateway that was started with the script use this command.</p>
+<pre><code>cd {GATEWAY_HOME}
+bin/gateway.sh stop
+</code></pre><p>If for some reason the gateway is stopped other than by using the command above you may need to clear the tracking PID.</p>
+<pre><code>cd {GATEWAY_HOME}
+bin/gateway.sh clean
+</code></pre><p><strong>NOTE: This command will also clear any .out and .err file from the {GATEWAY_HOME}/logs directory so use this with caution.</strong></p><h3><a id="8+-+Do+Hadoop+with+Knox">8 - Do Hadoop with Knox</a> <a href="#8+-+Do+Hadoop+with+Knox"><img src="markbook-section-link.png"/></a></h3><h4><a id="Invoke+the+LISTSTATUS+operation+on+WebHDFS+via+the+gateway.">Invoke the LISTSTATUS operation on WebHDFS via the gateway.</a> <a href="#Invoke+the+LISTSTATUS+operation+on+WebHDFS+via+the+gateway."><img src="markbook-section-link.png"/></a></h4><p>This will return a directory listing of the root (i.e. /) directory of HDFS.</p>
+<pre><code>curl -i -k -u guest:guest-password -X GET \
+    &#39;https://localhost:8443/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS&#39;
+</code></pre><p>The above command should return something along the lines of the output below. The exact information returned depends on the content within HDFS in your Hadoop cluster. Successfully executing this command at a minimum proves that the gateway is properly configured to provide access to WebHDFS. It does not necessarily prove that any of the other services are correctly configured to be accessible. To validate that, see the sections for the individual services in <a href="#Service+Details">Service Details</a>.</p>
+<pre><code>HTTP/1.1 200 OK
+Content-Type: application/json
+Content-Length: 760
+Server: Jetty(6.1.26)
+
+{&quot;FileStatuses&quot;:{&quot;FileStatus&quot;:[
+{&quot;accessTime&quot;:0,&quot;blockSize&quot;:0,&quot;group&quot;:&quot;hdfs&quot;,&quot;length&quot;:0,&quot;modificationTime&quot;:1350595859762,&quot;owner&quot;:&quot;hdfs&quot;,&quot;pathSuffix&quot;:&quot;apps&quot;,&quot;permission&quot;:&quot;755&quot;,&quot;replication&quot;:0,&quot;type&quot;:&quot;DIRECTORY&quot;},
+{&quot;accessTime&quot;:0,&quot;blockSize&quot;:0,&quot;group&quot;:&quot;mapred&quot;,&quot;length&quot;:0,&quot;modificationTime&quot;:1350595874024,&quot;owner&quot;:&quot;mapred&quot;,&quot;pathSuffix&quot;:&quot;mapred&quot;,&quot;permission&quot;:&quot;755&quot;,&quot;replication&quot;:0,&quot;type&quot;:&quot;DIRECTORY&quot;},
+{&quot;accessTime&quot;:0,&quot;blockSize&quot;:0,&quot;group&quot;:&quot;hdfs&quot;,&quot;length&quot;:0,&quot;modificationTime&quot;:1350596040075,&quot;owner&quot;:&quot;hdfs&quot;,&quot;pathSuffix&quot;:&quot;tmp&quot;,&quot;permission&quot;:&quot;777&quot;,&quot;replication&quot;:0,&quot;type&quot;:&quot;DIRECTORY&quot;},
+{&quot;accessTime&quot;:0,&quot;blockSize&quot;:0,&quot;group&quot;:&quot;hdfs&quot;,&quot;length&quot;:0,&quot;modificationTime&quot;:1350595857178,&quot;owner&quot;:&quot;hdfs&quot;,&quot;pathSuffix&quot;:&quot;user&quot;,&quot;permission&quot;:&quot;755&quot;,&quot;replication&quot;:0,&quot;type&quot;:&quot;DIRECTORY&quot;}
+]}}
+</code></pre><h4><a id="Put+a+file+in+HDFS+via+Knox.">Put a file in HDFS via Knox.</a> <a href="#Put+a+file+in+HDFS+via+Knox."><img src="markbook-section-link.png"/></a></h4>
+<pre><code>curl -i -k -u guest:guest-password -X PUT \
+    &#39;https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp/LICENSE?op=CREATE&#39;
+
+curl -i -k -u guest:guest-password -T LICENSE -X PUT \
+    &#39;{Value of Location header from command response above}&#39;
+</code></pre><h4><a id="Get+a+file+in+HDFS+via+Knox.">Get a file in HDFS via Knox.</a> <a href="#Get+a+file+in+HDFS+via+Knox."><img src="markbook-section-link.png"/></a></h4>
+<pre><code>curl -i -k -u guest:guest-password -X GET \
+    &#39;https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp/LICENSE?op=OPEN&#39;
+
+curl -i -k -u guest:guest-password -X GET \
+    &#39;{Value of Location header from command response above}&#39;
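+
+# Sketch, not part of the original steps: the -L option of curl follows the
+# redirect automatically, which may replace the two commands above. This
+# assumes the Location header points back at the same gateway host, so that
+# curl resends the credentials.
+curl -i -k -L -u guest:guest-password -X GET \
+    &#39;https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp/LICENSE?op=OPEN&#39;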
+</code></pre><h2><a id="Apache+Knox+Details">Apache Knox Details</a> <a href="#Apache+Knox+Details"><img src="markbook-section-link.png"/></a></h2><p>This section provides everything you need to know to get the Knox gateway up and running against a Hadoop cluster.</p><h4><a id="Hadoop">Hadoop</a> <a href="#Hadoop"><img src="markbook-section-link.png"/></a></h4><p>An existing Hadoop 2.x cluster is required for Knox to sit in front of and protect. It is possible to use a Hadoop cluster deployed on EC2 but this will require additional configuration not covered here. It is also possible to protect access to the services of a Hadoop cluster that is secured with Kerberos. This too requires additional configuration that is described in other sections of this guide. See <a href="#Supported+Services">Supported Services</a> for details on what is supported for this release.</p><p>The Hadoop cluster should have at least WebHDFS, WebHCat (i.e. Templeton) and Oozie configured, deployed and running. HBase/Stargate and Hive can also be accessed via the Knox Gateway given the proper versions and configuration.</p><p>The instructions that follow assume a few things:</p>
+<ol>
+  <li>The gateway is <em>not</em> collocated with the Hadoop clusters themselves.</li>
+  <li>The host names and IP addresses of the cluster services are accessible by the gateway wherever it happens to be running.</li>
+</ol><p>All of the instructions and samples provided here are tailored and tested to work &ldquo;out of the box&rdquo; against a <a href="http://hortonworks.com/products/hortonworks-sandbox">Hortonworks Sandbox 2.x VM</a>.</p><h4><a id="Apache+Knox+Directory+Layout">Apache Knox Directory Layout</a> <a href="#Apache+Knox+Directory+Layout"><img src="markbook-section-link.png"/></a></h4><p>Knox can be installed by expanding the zip/archive file.</p><p>The table below provides a brief explanation of the important files and directories within <code>{GATEWAY_HOME}</code>.</p>
+<table>
+  <thead>
+    <tr>
+      <th>Directory </th>
+      <th>Purpose </th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>conf/ </td>
+      <td>Contains configuration files that apply to the gateway globally (i.e. not cluster specific). </td>
+    </tr>
+    <tr>
+      <td>data/ </td>
+      <td>Contains security and topology specific artifacts that require read/write access at runtime </td>
+    </tr>
+    <tr>
+      <td>conf/topologies/ </td>
+      <td>Contains topology files that represent Hadoop clusters which the gateway uses to deploy cluster proxies </td>
+    </tr>
+    <tr>
+      <td>data/security/ </td>
+      <td>Contains the persisted master secret and keystore dir </td>
+    </tr>
+    <tr>
+      <td>data/security/keystores/ </td>
+      <td>Contains the gateway identity keystore and credential stores for the gateway and each deployed cluster topology </td>
+    </tr>
+    <tr>
+      <td>data/services </td>
+      <td>Contains service behavior definitions for the services currently supported. </td>
+    </tr>
+    <tr>
+      <td>bin/ </td>
+      <td>Contains the executable shell scripts, batch files and JARs for clients and servers. </td>
+    </tr>
+    <tr>
+      <td>data/deployments/ </td>
+      <td>Contains deployed cluster topologies used to protect access to specific Hadoop clusters. </td>
+    </tr>
+    <tr>
+      <td>lib/ </td>
+      <td>Contains the JARs for all the components that make up the gateway. </td>
+    </tr>
+    <tr>
+      <td>dep/ </td>
+      <td>Contains the JARs for all of the components upon which the gateway depends. </td>
+    </tr>
+    <tr>
+      <td>ext/ </td>
+      <td>A directory where user supplied extension JARs can be placed to extend the gateway&rsquo;s functionality. </td>
+    </tr>
+    <tr>
+      <td>pids/ </td>
+      <td>Contains the process IDs for the running LDAP and gateway servers </td>
+    </tr>
+    <tr>
+      <td>samples/ </td>
+      <td>Contains a number of samples that can be used to explore the functionality of the gateway. </td>
+    </tr>
+    <tr>
+      <td>templates/ </td>
+      <td>Contains default configuration files that can be copied and customized. </td>
+    </tr>
+    <tr>
+      <td>README </td>
+      <td>Provides basic information about the Apache Knox Gateway. </td>
+    </tr>
+    <tr>
+      <td>ISSUES </td>
+      <td>Describes significant known issues. </td>
+    </tr>
+    <tr>
+      <td>CHANGES </td>
+      <td>Enumerates the changes between releases. </td>
+    </tr>
+    <tr>
+      <td>LICENSE </td>
+      <td>Documents the license under which this software is provided. </td>
+    </tr>
+    <tr>
+      <td>NOTICE </td>
+      <td>Documents required attribution notices for included dependencies. </td>
+    </tr>
+  </tbody>
+</table><h3><a id="Supported+Services">Supported Services</a> <a href="#Supported+Services"><img src="markbook-section-link.png"/></a></h3><p>This table enumerates the versions of various Hadoop services that have been tested to work with the Knox Gateway.</p>
+<table>
+  <thead>
+    <tr>
+      <th>Service </th>
+      <th>Version </th>
+      <th>Non-Secure </th>
+      <th>Secure </th>
+      <th>HA </th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>WebHDFS </td>
+      <td>2.4.0 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/></td>
+    </tr>
+    <tr>
+      <td>WebHCat/Templeton </td>
+      <td>0.13.0 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/></td>
+    </tr>
+    <tr>
+      <td>Oozie </td>
+      <td>4.0.0 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/></td>
+    </tr>
+    <tr>
+      <td>HBase </td>
+      <td>0.98.0 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/></td>
+    </tr>
+    <tr>
+      <td>Hive (via WebHCat) </td>
+      <td>0.13.0 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/></td>
+    </tr>
+    <tr>
+      <td>Hive (via JDBC/ODBC) </td>
+      <td>0.13.0 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/></td>
+    </tr>
+    <tr>
+      <td>Yarn ResourceManager </td>
+      <td>2.5.0 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="error.png"  alt="n"/></td>
+    </tr>
+    <tr>
+      <td>Storm </td>
+      <td>0.9.3 </td>
+      <td><img src="check.png"  alt="y"/> </td>
+      <td><img src="error.png"  alt="n"/> </td>
+      <td><img src="error.png"  alt="n"/></td>
+    </tr>
+  </tbody>
+</table><h3><a id="More+Examples">More Examples</a> <a href="#More+Examples"><img src="markbook-section-link.png"/></a></h3><p>These examples provide more detail about how to access various Apache Hadoop services via the Apache Knox Gateway.</p>
+<ul>
+  <li><a href="#WebHDFS+Examples">WebHDFS Examples</a></li>
+  <li><a href="#WebHCat+Examples">WebHCat Examples</a></li>
+  <li><a href="#Oozie+Examples">Oozie Examples</a></li>
+  <li><a href="#HBase+Examples">HBase Examples</a></li>
+  <li><a href="#Hive+Examples">Hive Examples</a></li>
+  <li><a href="#Yarn+Examples">Yarn Examples</a></li>
+  <li><a href="#Storm+Examples">Storm Examples</a></li>
+</ul><h3><a id="Gateway+Samples">Gateway Samples</a> <a href="#Gateway+Samples"><img src="markbook-section-link.png"/></a></h3><p>The purpose of the samples within the {GATEWAY_HOME}/samples directory is to demonstrate the capabilities of the Apache Knox Gateway to provide access to the numerous APIs that are available from the service components of a Hadoop cluster.</p><p>Depending on exactly how your Knox installation was done, there will be some number of steps required in order to fully install and configure the samples for use.</p><p>This section describes the assumptions of the samples and the steps to get them to work in a couple of different deployment scenarios.</p><h4><a id="Assumptions+of+the+Samples">Assumptions of the Samples</a> <a href="#Assumptions+of+the+Samples"><img src="markbook-section-link.png"/></a></h4><p>The samples were initially written with the intent of working out of the box for the various Hadoop demo environments that are deployed as a single node cluster inside of a VM. The following assumptions were made from that context and should be understood in order to get the samples to work in other deployment scenarios:</p>
+<ul>
+  <li>That there is a valid Java JDK on the PATH for executing the samples</li>
+  <li>The Knox Demo LDAP server is running on localhost and port 33389 which is the default port for the ApacheDS LDAP server.</li>
+  <li>That the LDAP directory in use has a set of demo users provisioned with the convention of username and username&ldquo;-password&rdquo; as the password. Most of the samples have some variation of this pattern with &ldquo;guest&rdquo; and &ldquo;guest-password&rdquo;.</li>
+  <li>That the Knox Gateway instance is running on the same machine from which you will run the samples (therefore &ldquo;localhost&rdquo;) and that the default port of &ldquo;8443&rdquo; is being used.</li>
+  <li>Finally, that there is a properly provisioned sandbox.xml topology in the <code>{GATEWAY_HOME}/conf/topologies</code> directory that is configured to point to the actual host and ports of running service components.</li>
+</ul><h4><a id="Steps+for+Demo+Single+Node+Clusters">Steps for Demo Single Node Clusters</a> <a href="#Steps+for+Demo+Single+Node+Clusters"><img src="markbook-section-link.png"/></a></h4><p>There should be little, if anything, to do in a demo environment that has been provisioned to illustrate the use of Apache Knox.</p><p>However, the following items are worth checking before you start:</p>
+<ol>
+  <li>The sandbox.xml topology is configured properly for the deployed services</li>
+  <li>An LDAP server is running with the guest/guest-password user available in the directory</li>
+</ol><h4><a id="Steps+for+Ambari+Deployed+Knox+Gateway">Steps for Ambari Deployed Knox Gateway</a> <a href="#Steps+for+Ambari+Deployed+Knox+Gateway"><img src="markbook-section-link.png"/></a></h4><p>Apache Knox instances that are under the management of Ambari are generally assumed not to be demo instances. These instances are in place to facilitate development, testing or production Hadoop clusters.</p><p>The Knox samples can however be made to work with Ambari managed Knox instances with a few steps:</p>
+<ol>
+  <li>You need to have ssh access to the environment in order for the localhost assumption within the samples to be valid.</li>
+  <li>The Knox Demo LDAP Server is started - you can start it from Ambari</li>
+  <li>The default.xml topology file can be copied to sandbox.xml in order to satisfy the topology name assumption in the samples.</li>
+  <li><p>Be sure to use an actual Java JRE to run the sample with something like:</p><p><code>/usr/jdk64/jdk1.7.0_67/bin/java -jar bin/shell.jar samples/ExampleWebHdfsLs.groovy</code></p></li>
+</ol><h4><a id="Steps+for+a+Manually+Installed+Knox+Gateway">Steps for a Manually Installed Knox Gateway</a> <a href="#Steps+for+a+Manually+Installed+Knox+Gateway"><img src="markbook-section-link.png"/></a></h4><p>For manually installed Knox instances, there is really no way for the installer to know how to configure the topology file for you.</p><p>Essentially, these steps are identical to the Ambari deployed instance except that #3 should be replaced with the configuration of the out of the box sandbox.xml to point the configuration at the proper hosts and ports.</p>
+<ol>
+  <li>You need to have ssh access to the environment in order for the localhost assumption within the samples to be valid.</li>
+  <li>The Knox Demo LDAP Server is started - you can start it with <code>bin/ldap.sh start</code></li>
+  <li>Change the hosts and ports within the <code>{GATEWAY_HOME}/conf/topologies/sandbox.xml</code> to reflect your actual cluster service locations.</li>
+  <li><p>Be sure to use an actual Java JRE to run the sample with something like:</p><p><code>/usr/jdk64/jdk1.7.0_67/bin/java -jar bin/shell.jar samples/ExampleWebHdfsLs.groovy</code></p></li>
+</ol><h2><a id="Gateway+Details">Gateway Details</a> <a href="#Gateway+Details"><img src="markbook-section-link.png"/></a></h2><p>This section describes the details of the Knox Gateway itself, including:</p>
+<ul>
+  <li>How URLs are mapped between a gateway that services multiple Hadoop clusters and the clusters themselves</li>
+  <li>How the gateway is configured through gateway-site.xml and cluster specific topology files</li>
+  <li>How to configure the various policy enforcement provider features such as authentication, authorization, auditing, hostmapping, etc.</li>
+</ul><h3><a id="URL+Mapping">URL Mapping</a> <a href="#URL+Mapping"><img src="markbook-section-link.png"/></a></h3><p>The gateway functions much like a reverse proxy. As such, it maintains a mapping of URLs that are exposed externally by the gateway to URLs that are provided by the Hadoop cluster.</p><h4><a id="Default+Topology+URLs">Default Topology URLs</a> <a href="#Default+Topology+URLs"><img src="markbook-section-link.png"/></a></h4><p>In order to provide compatibility with the Hadoop Java client and existing CLI tools, the Knox Gateway provides a feature called the Default Topology. This refers to a topology deployment that is able to route URLs without the additional context that the gateway uses for differentiating one Hadoop cluster from another. This allows the URLs to match those used by existing clients that may access webhdfs through the Hadoop file system abstraction.</p><p>When a topology file is deployed with a file name that matches the configured default topology name, a specialized mapping for URLs is installed for that particular topology. This allows the URLs that are expected by the existing Hadoop CLIs for webhdfs to be used in interacting with the specific Hadoop cluster that is represented by the default topology file.</p><p>The configuration for the default topology name is found in gateway-site.xml as a property called: &ldquo;default.app.topology.name&rdquo;.</p><p>The default value for this property is &ldquo;sandbox&rdquo;.</p><p>Therefore, when deploying the sandbox.xml topology, both of the following example URLs work for the same underlying Hadoop cluster:</p>
+<pre><code>https://{gateway-host}:{gateway-port}/webhdfs
+https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/webhdfs
+</code></pre><p>These default topology URLs exist for all of the services in the topology.</p><h4><a id="Fully+Qualified+URLs">Fully Qualified URLs</a> <a href="#Fully+Qualified+URLs"><img src="markbook-section-link.png"/></a></h4><p>Examples of mappings for WebHDFS, WebHCat, Oozie, HBase and Hive JDBC are shown below. These mappings are generated from the combination of the gateway configuration file (i.e. <code>{GATEWAY_HOME}/conf/gateway-site.xml</code>) and the cluster topology descriptors (e.g. <code>{GATEWAY_HOME}/conf/topologies/{cluster-name}.xml</code>). The port numbers shown for the Cluster URLs represent the default ports for these services. The actual port number may be different for a given cluster.</p>
+<ul>
+  <li>WebHDFS
+  <ul>
+    <li>Gateway: <code>https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/webhdfs</code></li>
+    <li>Cluster: <code>http://{webhdfs-host}:50070/webhdfs</code></li>
+  </ul></li>
+  <li>WebHCat (Templeton)
+  <ul>
+    <li>Gateway: <code>https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/templeton</code></li>
+    <li>Cluster: <code>http://{webhcat-host}:50111/templeton</code></li>
+  </ul></li>
+  <li>Oozie
+  <ul>
+    <li>Gateway: <code>https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/oozie</code></li>
+    <li>Cluster: <code>http://{oozie-host}:11000/oozie</code></li>
+  </ul></li>
+  <li>HBase
+  <ul>
+    <li>Gateway: <code>https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/hbase</code></li>
+    <li>Cluster: <code>http://{hbase-host}:8080</code></li>
+  </ul></li>
+  <li>Hive JDBC
+  <ul>
+    <li>Gateway: <code>jdbc:hive2://{gateway-host}:{gateway-port}/;ssl=true;sslTrustStore={gateway-trust-store-path};trustStorePassword={gateway-trust-store-password};transportMode=http;httpPath={gateway-path}/{cluster-name}/hive</code></li>
+    <li>Cluster: <code>http://{hive-host}:10001/cliservice</code></li>
+  </ul></li>
+</ul><p>The values for <code>{gateway-host}</code>, <code>{gateway-port}</code>, <code>{gateway-path}</code> are provided via the gateway configuration file (i.e. <code>{GATEWAY_HOME}/conf/gateway-site.xml</code>).</p><p>The value for <code>{cluster-name}</code> is derived from the file name of the cluster topology descriptor (e.g. <code>{GATEWAY_HOME}/conf/topologies/{cluster-name}.xml</code>).</p><p>The values for <code>{webhdfs-host}</code>, <code>{webhcat-host}</code>, <code>{oozie-host}</code>, <code>{hbase-host}</code> and <code>{hive-host}</code> are provided via the cluster topology descriptor (e.g. <code>{GATEWAY_HOME}/conf/topologies/{cluster-name}.xml</code>).</p><p>Note: The ports 50070, 50111, 11000, 8080 and 10001 are the defaults for WebHDFS, WebHCat, Oozie, HBase and Hive respectively. Their values can also be provided via the cluster topology descriptor if your Hadoop cluster uses different ports.</p><p>Note: The HBase REST API uses port 8080 by default. This often clashes with other running services. In the Hortonworks Sandbox, Ambari might be running on this port, so you might have to change it to a different port (e.g. 60080).</p><h3><a id="Configuration">Configuration</a> <a href="#Configuration"><img src="markbook-section-link.png"/></a></h3><p>Configuration for Apache Knox includes:</p>
+<ol>
+  <li><a href="#Related+Cluster+Configuration">Related Cluster Configuration</a> that must be done within the Hadoop cluster to allow Knox to communicate with various services</li>
+  <li><a href="#Gateway+Server+Configuration">Gateway Server Configuration</a>, which covers the configurable elements of the server itself that apply to behavior spanning all topologies or managed Hadoop clusters</li>
+  <li><a href="#Topology+Descriptors">Topology Descriptors</a> which are the descriptors for controlling access to Hadoop clusters in various ways</li>
+</ol><h3><a id="Related+Cluster+Configuration">Related Cluster Configuration</a> <a href="#Related+Cluster+Configuration"><img src="markbook-section-link.png"/></a></h3><p>The following configuration changes must be made to your cluster to allow Apache Knox to dispatch requests to the various service components on behalf of end users.</p><h4><a id="Grant+Proxy+privileges+for+Knox+user+in+`core-site.xml`+on+Hadoop+master+nodes">Grant Proxy privileges for Knox user in <code>core-site.xml</code> on Hadoop master nodes</a> <a href="#Grant+Proxy+privileges+for+Knox+user+in+`core-site.xml`+on+Hadoop+master+nodes"><img src="markbook-section-link.png"/></a></h4><p>Update <code>core-site.xml</code> and add the following lines towards the end of the file.</p><p>Replace <code>FQDN_OF_KNOX_HOST</code> with the fully qualified domain name of the host running the Knox gateway. You can usually find this by running <code>hostname -f</code> on that host.</p><p>You can use <code>*</code> for local developer testing if the Knox host does not have a static IP.</p>
+<pre><code>&lt;property&gt;
+    &lt;name&gt;hadoop.proxyuser.knox.groups&lt;/name&gt;
+    &lt;value&gt;users&lt;/value&gt;
+&lt;/property&gt;
+&lt;property&gt;
+    &lt;name&gt;hadoop.proxyuser.knox.hosts&lt;/name&gt;
+    &lt;value&gt;FQDN_OF_KNOX_HOST&lt;/value&gt;
+&lt;/property&gt;
+</code></pre><h4><a id="Grant+proxy+privilege+for+Knox+in+`webhcat-site.xml`+on+Hadoop+master+nodes">Grant proxy privilege for Knox in <code>webhcat-site.xml</code> on Hadoop master nodes</a> <a href="#Grant+proxy+privilege+for+Knox+in+`webhcat-site.xml`+on+Hadoop+master+nodes"><img src="markbook-section-link.png"/></a></h4><p>Update <code>webhcat-site.xml</code> and add the following lines towards the end of the file.</p><p>Replace <code>FQDN_OF_KNOX_HOST</code> with the fully qualified domain name of the host running the Knox gateway. You can use <code>*</code> for local developer testing if the Knox host does not have a static IP.</p>
+<pre><code>&lt;property&gt;
+    &lt;name&gt;webhcat.proxyuser.knox.groups&lt;/name&gt;
+    &lt;value&gt;users&lt;/value&gt;
+&lt;/property&gt;
+&lt;property&gt;
+    &lt;name&gt;webhcat.proxyuser.knox.hosts&lt;/name&gt;
+    &lt;value&gt;FQDN_OF_KNOX_HOST&lt;/value&gt;
+&lt;/property&gt;
+</code></pre><h4><a id="Grant+proxy+privilege+for+Knox+in+`oozie-site.xml`+on+Oozie+host">Grant proxy privilege for Knox in <code>oozie-site.xml</code> on Oozie host</a> <a href="#Grant+proxy+privilege+for+Knox+in+`oozie-site.xml`+on+Oozie+host"><img src="markbook-section-link.png"/></a></h4><p>Update <code>oozie-site.xml</code> and add the following lines towards the end of the file.</p><p>Replace <code>FQDN_OF_KNOX_HOST</code> with the fully qualified domain name of the host running the Knox gateway. You can use <code>*</code> for local developer testing if the Knox host does not have a static IP.</p>
+<pre><code>&lt;property&gt;
+    &lt;name&gt;oozie.service.ProxyUserService.proxyuser.knox.groups&lt;/name&gt;
+    &lt;value&gt;users&lt;/value&gt;
+&lt;/property&gt;
+&lt;property&gt;
+    &lt;name&gt;oozie.service.ProxyUserService.proxyuser.knox.hosts&lt;/name&gt;
+    &lt;value&gt;FQDN_OF_KNOX_HOST&lt;/value&gt;
+&lt;/property&gt;
+</code></pre><h4><a id="Enable+http+transport+mode+and+use+substitution+in+HiveServer2">Enable http transport mode and use substitution in HiveServer2</a> <a href="#Enable+http+transport+mode+and+use+substitution+in+HiveServer2"><img src="markbook-section-link.png"/></a></h4><p>Update <code>hive-site.xml</code> and set the following properties on HiveServer2 hosts. Some of the properties may already be in the hive-site.xml. Ensure that the values match the ones below.</p>
+<pre><code>&lt;property&gt;
+    &lt;name&gt;hive.server2.allow.user.substitution&lt;/name&gt;
+    &lt;value&gt;true&lt;/value&gt;
+&lt;/property&gt;
+
+&lt;property&gt;
+    &lt;name&gt;hive.server2.transport.mode&lt;/name&gt;
+    &lt;value&gt;http&lt;/value&gt;
+    &lt;description&gt;Server transport mode. &quot;binary&quot; or &quot;http&quot;.&lt;/description&gt;
+&lt;/property&gt;
+
+&lt;property&gt;
+    &lt;name&gt;hive.server2.thrift.http.port&lt;/name&gt;
+    &lt;value&gt;10001&lt;/value&gt;
+    &lt;description&gt;Port number when in HTTP mode.&lt;/description&gt;
+&lt;/property&gt;
+
+&lt;property&gt;
+    &lt;name&gt;hive.server2.thrift.http.path&lt;/name&gt;
+    &lt;value&gt;cliservice&lt;/value&gt;
+    &lt;description&gt;Path component of URL endpoint when in HTTP mode.&lt;/description&gt;
+&lt;/property&gt;
+</code></pre><h4><a id="Gateway+Server+Configuration">Gateway Server Configuration</a> <a href="#Gateway+Server+Configuration"><img src="markbook-section-link.png"/></a></h4><p>The following table illustrates the configurable elements of the Apache Knox Gateway at the server level via gateway-site.xml.</p>
+<table>
+  <thead>
+    <tr>
+      <th>property </th>
+      <th>description </th>
+      <th>default</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>gateway.deployment.dir</td>
+      <td>The directory within GATEWAY_HOME that contains gateway topology deployments.</td>
+      <td>{GATEWAY_HOME}/data/deployments</td>
+    </tr>
+    <tr>
+      <td>gateway.security.dir</td>
+      <td>The directory within GATEWAY_HOME that contains the required security artifacts</td>
+      <td>{GATEWAY_HOME}/data/security</td>
+    </tr>
+    <tr>
+      <td>gateway.data.dir</td>
+      <td>The directory within GATEWAY_HOME that contains the gateway instance data</td>
+      <td>{GATEWAY_HOME}/data</td>
+    </tr>
+    <tr>
+      <td>gateway.services.dir</td>
+      <td>The directory within GATEWAY_HOME that contains the gateway services definitions.</td>
+      <td>{GATEWAY_HOME}/services</td>
+    </tr>
+    <tr>
+      <td>gateway.hadoop.conf.dir</td>
+      <td>The directory within GATEWAY_HOME that contains the gateway configuration</td>
+      <td>{GATEWAY_HOME}/conf</td>
+    </tr>
+    <tr>
+      <td>gateway.frontend.url</td>
+      <td>The URL that should be used during rewriting so that it can rewrite the URLs with the correct &ldquo;frontend&rdquo; URL</td>
+      <td>none</td>
+    </tr>
+    <tr>
+      <td>gateway.xforwarded.enabled</td>
+      <td>Indicates whether support for some X-Forwarded-* headers is enabled</td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td>gateway.trust.all.certs</td>
+      <td>Indicates whether all presented client certs should establish trust</td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td>gateway.client.auth.needed</td>
+      <td>Indicates whether clients are required to establish a trust relationship with client certificates</td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td>gateway.truststore.path</td>
+      <td>Location of the truststore for client certificates to be trusted</td>
+      <td>gateway.jks</td>
+    </tr>
+    <tr>
+      <td>gateway.truststore.type</td>
+      <td>Indicates the type of truststore</td>
+      <td>JKS</td>
+    </tr>
+    <tr>
+      <td>gateway.keystore.type</td>
+      <td>Indicates the type of keystore for the identity store</td>
+      <td>JKS</td>
+    </tr>
+    <tr>
+      <td>gateway.jdk.tls.ephemeralDHKeySize</td>
+      <td>Corresponds to the JDK property jdk.tls.ephemeralDHKeySize, which is defined to customize the ephemeral DH key sizes. The minimum acceptable DH key size is 1024 bits, except for exportable cipher suites or legacy mode (jdk.tls.ephemeralDHKeySize=legacy).</td>
+      <td>2048</td>
+    </tr>
+    <tr>
+      <td>gateway.threadpool.max</td>
+      <td>The maximum concurrent requests the server will process. The default is 254. Connections beyond this will be queued.</td>
+      <td>254</td>
+    </tr>
+    <tr>
+      <td>gateway.httpclient.maxConnections</td>
+      <td>The maximum number of connections that a single httpclient will maintain to a single host:port. The default is 32.</td>
+      <td>32</td>
+    </tr>
+    <tr>
+      <td>gateway.httpclient.connectionTimeout</td>
+      <td>The amount of time to wait when attempting a connection. The natural unit is milliseconds but a &lsquo;s&rsquo; or &lsquo;m&rsquo; suffix may be used for seconds or minutes respectively. The default timeout is system dependent. </td>
+      <td>System Dependent</td>
+    </tr>
+    <tr>
+      <td>gateway.httpclient.socketTimeout</td>
+      <td>The amount of time to wait for data on a socket before aborting the connection. The natural unit is milliseconds but a &lsquo;s&rsquo; or &lsquo;m&rsquo; suffix may be used for seconds or minutes respectively. The default timeout is system dependent but is likely to be indefinite. </td>
+      <td>System Dependent</td>
+    </tr>
+    <tr>
+      <td>gateway.httpserver.requestBuffer</td>
+      <td>The size of the HTTP server request buffer. The default is 16K.</td>
+      <td>16384</td>
+    </tr>
+    <tr>
+      <td>gateway.httpserver.requestHeaderBuffer</td>
+      <td>The size of the HTTP server request header buffer. The default is 8K.</td>
+      <td>8192</td>
+    </tr>
+    <tr>
+      <td>gateway.httpserver.responseBuffer</td>
+      <td>The size of the HTTP server response buffer. The default is 32K.</td>
+      <td>32768</td>
+    </tr>
+    <tr>
+      <td>gateway.httpserver.responseHeaderBuffer</td>
+      <td>The size of the HTTP server response header buffer. The default is 8K.</td>
+      <td>8192</td>
+    </tr>
+    <tr>
+      <td>ssl.enabled</td>
+      <td>Indicates whether SSL is enabled for the Gateway</td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td>ssl.include.ciphers</td>
+      <td>A comma separated list of ciphers to accept for SSL. See the <a href="http://docs.oracle.com/javase/8/docs/technotes/guides/security/SunProviders.html#SunJSSEProvider">JSSE Provider docs</a> for possible ciphers. These can also contain regular expressions as shown in the <a href="http://www.eclipse.org/jetty/documentation/current/configuring-ssl.html">Jetty documentation</a>.</td>
+      <td>all</td>
+    </tr>
+    <tr>
+      <td>ssl.exclude.ciphers</td>
+      <td>A comma separated list of ciphers to reject for SSL. See the <a href="http://docs.oracle.com/javase/8/docs/technotes/guides/security/SunProviders.html#SunJSSEProvider">JSSE Provider docs</a> for possible ciphers. These can also contain regular expressions as shown in the <a href="http://www.eclipse.org/jetty/documentation/current/configuring-ssl.html">Jetty documentation</a>.</td>
+      <td>none</td>
+    </tr>
+    <tr>
+      <td>ssl.exclude.protocols</td>
+      <td>A comma separated list of protocols to reject for SSL, or &ldquo;none&rdquo;</td>
+      <td>SSLv3</td>
+    </tr>
+  </tbody>
+</table><h4><a id="Topology+Descriptors">Topology Descriptors</a> <a href="#Topology+Descriptors"><img src="markbook-section-link.png"/></a></h4><p>The topology descriptor files provide the gateway with per-cluster configuration information. This includes configuration for both the providers within the gateway and the services within the Hadoop cluster. These files are located in <code>{GATEWAY_HOME}/conf/topologies</code>. The general outline of this document looks like this.</p>
+<pre><code>&lt;topology&gt;
+    &lt;gateway&gt;
+        &lt;provider&gt;
+        &lt;/provider&gt;
+    &lt;/gateway&gt;
+    &lt;service&gt;
+    &lt;/service&gt;
+&lt;/topology&gt;
+</code></pre><p>There are typically multiple <code>&lt;provider&gt;</code> and <code>&lt;service&gt;</code> elements.</p>
+<dl><dt>/topology</dt><dd>Defines the provider configuration and service topology for a single Hadoop cluster.</dd><dt>/topology/gateway</dt><dd>Groups all of the provider elements.</dd><dt>/topology/gateway/provider</dt><dd>Defines the configuration of a specific provider for the cluster.</dd><dt>/topology/service</dt><dd>Defines the location of a specific Hadoop service within the Hadoop cluster.</dd>
+</dl><h5><a id="Provider+Configuration">Provider Configuration</a> <a href="#Provider+Configuration"><img src="markbook-section-link.png"/></a></h5><p>Provider configuration is used to customize the behavior of a particular gateway feature. The general outline of a provider element looks like this.</p>
+<pre><code>&lt;provider&gt;
+    &lt;role&gt;authentication&lt;/role&gt;
+    &lt;name&gt;ShiroProvider&lt;/name&gt;
+    &lt;enabled&gt;true&lt;/enabled&gt;
+    &lt;param&gt;
+        &lt;name&gt;&lt;/name&gt;
+        &lt;value&gt;&lt;/value&gt;
+    &lt;/param&gt;
+&lt;/provider&gt;
+</code></pre>
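+<p>As an illustration of the outline above, the following sketch fills in an authentication provider with several parameters. The parameter names and values are modeled on a typical sandbox-style topology and should be treated as assumptions; the LDAP URL and DN template in particular are placeholders to be replaced with values for your environment.</p>

```xml
<provider>
    <role>authentication</role>
    <name>ShiroProvider</name>
    <enabled>true</enabled>
    <!-- Each param is a name/value pair passed to the provider. -->
    <!-- The names and values below are assumed, illustrative settings. -->
    <param>
        <name>sessionTimeout</name>
        <value>30</value>
    </param>
    <param>
        <name>main.ldapRealm.userDnTemplate</name>
        <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
    </param>
    <param>
        <name>main.ldapRealm.contextFactory.url</name>
        <value>ldap://localhost:33389</value>
    </param>
    <param>
        <name>urls./**</name>
        <value>authcBasic</value>
    </param>
</provider>
```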
+<dl><dt>/topology/gateway/provider</dt><dd>Groups information for a specific provider.</dd><dt>/topology/gateway/provider/role</dt><dd>Defines the role of a particular provider. There are a number of pre-defined roles used by out-of-the-box provider plugins for the gateway. These roles are: authentication, identity-assertion, rewrite and hostmap.</dd><dt>/topology/gateway/provider/name</dt><dd>Defines the name of the provider to which this configuration applies. There can be multiple provider implementations for a given role. The name is used to identify which particular provider is being configured. Typically each topology descriptor should contain only one provider for each role but there are exceptions.</dd><dt>/topology/gateway/provider/enabled</dt><dd>Allows a particular provider to be enabled or disabled via <code>true</code> or <code>false</code> respectively. When a provider is disabled any filters associated with that provider are excluded from the processing chain.</dd><dt>/topology/gateway/provider/param</dt><dd>These elements are used to supply provider configuration. There can be zero or more of these per provider.</dd><dt>/topology/gateway/provider/param/name</dt><dd>The name of a parameter to pass to the provider.</dd><dt>/topology/gateway/provider/param/value</dt><dd>The value of a parameter to pass to the provider.</dd>
+</dl><h5><a id="Service+Configuration">Service Configuration</a> <a href="#Service+Configuration"><img src="markbook-section-link.png"/></a></h5><p>Service configuration is used to specify the location of services within the Hadoop cluster. The general outline of a service element looks like this.</p>
+<pre><code>&lt;service&gt;
+    &lt;role&gt;WEBHDFS&lt;/role&gt;
+    &lt;url&gt;http://localhost:50070/webhdfs&lt;/url&gt;
+&lt;/service&gt;
+</code></pre>
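+<p>A topology usually lists several services. The following sketch shows one <code>&lt;service&gt;</code> element per role, using the default ports and paths listed under Fully Qualified URLs. The host name <code>cluster-host.example.com</code> is a hypothetical placeholder, and a real cluster may run each service on a different host or port.</p>

```xml
<!-- cluster-host.example.com is a placeholder; substitute your own hosts. -->
<service>
    <role>WEBHDFS</role>
    <url>http://cluster-host.example.com:50070/webhdfs</url>
</service>
<service>
    <role>WEBHCAT</role>
    <url>http://cluster-host.example.com:50111/templeton</url>
</service>
<service>
    <role>OOZIE</role>
    <url>http://cluster-host.example.com:11000/oozie</url>
</service>
<service>
    <role>WEBHBASE</role>
    <url>http://cluster-host.example.com:8080</url>
</service>
<service>
    <role>HIVE</role>
    <url>http://cluster-host.example.com:10001/cliservice</url>
</service>
```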
+<dl><dt>/topology/service</dt><dd>Provides information about a particular service within the Hadoop cluster. Not all services are necessarily exposed as gateway endpoints.</dd><dt>/topology/service/role</dt><dd>Identifies the role of this service. Currently supported roles are: WEBHDFS, WEBHCAT, WEBHBASE, OOZIE, HIVE, NAMENODE, JOBTRACKER, RESOURCEMANAGER. Additional service roles can be supported via plugins.</dd><dt>/topology/service/url</dt><dd>The URL identifying the location of a particular service within the Hadoop cluster.</dd>
+</dl><h4><a id="Hostmap+Provider">Hostmap Provider</a> <a href="#Hostmap+Provider"><img src="markbook-section-link.png"/></a></h4><p>The purpose of the Hostmap provider is to handle situations where hosts are known by one name within the cluster and another name externally. This frequently occurs when virtual machines are used and in particular when using cloud hosting services. Currently, the Hostmap provider is configured as part of the topology file. The basic structure is shown below.</p>
+<pre><code>&lt;topology&gt;
+    &lt;gateway&gt;
+        ...
+        &lt;provider&gt;
+            &lt;role&gt;hostmap&lt;/role&gt;
+            &lt;name&gt;static&lt;/name&gt;
+            &lt;enabled&gt;true&lt;/enabled&gt;
+            &lt;param&gt;&lt;name&gt;external-host-name&lt;/name&gt;&lt;value&gt;internal-host-name&lt;/value&gt;&lt;/param&gt;
+        &lt;/provider&gt;
+        ...
+    &lt;/gateway&gt;
+    ...
+&lt;/topology&gt;
+</code></pre><p>This mapping is required because the Hadoop services running within the cluster are unaware that they are being accessed from outside the cluster. Therefore URLs returned as part of REST API responses will typically contain internal host names. Since clients outside the cluster will be unable to resolve those host names, they must be mapped to external host names.</p><h5><a id="Hostmap+Provider+Example+-+EC2">Hostmap Provider Example - EC2</a> <a href="#Hostmap+Provider+Example+-+EC2"><img src="markbook-section-link.png"/></a></h5><p>Consider an EC2 example where two VMs have been allocated. Each VM has an external host name by which it can be accessed via the internet. However, each EC2 VM is unaware of this external host name and is instead configured with the internal host name.</p>
+<pre><code>External HOSTNAMES:
+ec2-23-22-31-165.compute-1.amazonaws.com
+ec2-23-23-25-10.compute-1.amazonaws.com
+
+Internal HOSTNAMES:
+ip-10-118-99-172.ec2.internal
+ip-10-39-107-209.ec2.internal
+</code></pre><p>The Hostmap configuration required to allow access from outside the Hadoop cluster via the Apache Knox Gateway would be this.</p>
+<pre><code>&lt;topology&gt;
+    &lt;gateway&gt;
+        ...
+        &lt;provider&gt;
+            &lt;role&gt;hostmap&lt;/role&gt;
+            &lt;name&gt;static&lt;/name&gt;
+            &lt;enabled&gt;true&lt;/enabled&gt;
+            &lt;param&gt;
+                &lt;name&gt;ec2-23-22-31-165.compute-1.amazonaws.com&lt;/name&gt;
+                &lt;value&gt;ip-10-118-99-172.ec2.internal&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;ec2-23-23-25-10.compute-1.amazonaws.com&lt;/name&gt;
+                &lt;value&gt;ip-10-39-107-209.ec2.internal&lt;/value&gt;
+            &lt;/param&gt;
+        &lt;/provider&gt;
+        ...
+    &lt;/gateway&gt;
+    ...
+&lt;/topology&gt;
+</code></pre><h5><a id="Hostmap+Provider+Example+-+Sandbox">Hostmap Provider Example - Sandbox</a> <a href="#Hostmap+Provider+Example+-+Sandbox"><img src="markbook-section-link.png"/></a></h5><p>The Hortonworks Sandbox 2.x poses a different challenge for host name mapping. This version of the Sandbox uses port mapping to make the Sandbox VM appear as though it is accessible via localhost. However the Sandbox VM is internally configured to consider sandbox.hortonworks.com as the host name. So from the perspective of a client accessing Sandbox the external host name is localhost. The Hostmap configuration required to allow access to Sandbox from the host operating system is this.</p>
+<pre><code>&lt;topology&gt;
+    &lt;gateway&gt;
+        ...
+        &lt;provider&gt;
+            &lt;role&gt;hostmap&lt;/role&gt;
+            &lt;name&gt;static&lt;/name&gt;
+            &lt;enabled&gt;true&lt;/enabled&gt;
+            &lt;param&gt;
+                &lt;name&gt;localhost&lt;/name&gt;
+                &lt;value&gt;sandbox,sandbox.hortonworks.com&lt;/value&gt;
+            &lt;/param&gt;
+        &lt;/provider&gt;
+        ...
+    &lt;/gateway&gt;
+    ...
+&lt;/topology&gt;
+</code></pre><h5><a id="Hostmap+Provider+Configuration">Hostmap Provider Configuration</a> <a href="#Hostmap+Provider+Configuration"><img src="markbook-section-link.png"/></a></h5><p>Details about each provider configuration element are enumerated below.</p>
+<dl><dt>/topology/gateway/provider/role</dt><dd>The role for a Hostmap provider must always be <code>hostmap</code>.</dd><dt>/topology/gateway/provider/name</dt><dd>The Hostmap provider supplied out-of-the-box is selected via the name <code>static</code>.</dd><dt>/topology/gateway/provider/enabled</dt><dd>Host mapping can be enabled or disabled by providing <code>true</code> or <code>false</code>.</dd><dt>/topology/gateway/provider/param</dt><dd>Host mapping is configured by providing parameters for each external to internal mapping.</dd><dt>/topology/gateway/provider/param/name</dt><dd>The parameter names represent the external host names associated with the internal host names provided by the value element. This can be a comma separated list of host names that all represent the same physical host. When mapping from internal to external host names the first external host name in the list is used.</dd><dt>/topology/gateway/provider/param/value</dt><dd>The parameter values represent the internal host names associated with the external host names provided by the name element. This can be a comma separated list of host names that all represent the same physical host. When mapping from external to internal host names the first internal host name in the list is used.</dd>
+</dl><h4><a id="Logging">Logging</a> <a href="#Logging"><img src="markbook-section-link.png"/></a></h4><p>If necessary you can enable additional logging by editing the <code>log4j.properties</code> file in the <code>conf</code> directory. Changing the <code>rootLogger</code> value from <code>ERROR</code> to <code>DEBUG</code> will generate a large amount of debug logging. A number of useful, more fine-grained loggers are also provided in the file.</p><h4><a id="Java+VM+Options">Java VM Options</a> <a href="#Java+VM+Options"><img src="markbook-section-link.png"/></a></h4><p>TODO - Java VM options doc.</p><h4><a id="Persisting+the+Master+Secret">Persisting the Master Secret</a> <a href="#Persisting+the+Master+Secret"><img src="markbook-section-link.png"/></a></h4><p>The master secret is required to start the server. This secret is used by the gateway instance to access secured artifacts. Keystores, trust stores and credential stores are all protected with the master secret.</p><p>You may persist the master secret by supplying the <em>-persist-master</em> switch at startup. This will result in a warning indicating that persisting the secret is less secure than providing it at startup. We do make some provisions in order to protect the persisted password.</p><p>It is encrypted with AES 128 bit encryption and, where possible, the file permissions are set so that it is only accessible by the user that the gateway is running as.</p><p>After persisting the secret, ensure that the file at data/security/master has the appropriate permissions set for your environment. This is probably the most important layer of defense for the master secret. Do not assume that the encryption is sufficient protection.</p><p>A specific user should be created to run the gateway. This user will be the only user with permissions for the persisted master file.</p><p>See the Knox CLI section for descriptions of the command line utilities related to the master secret.</p><h4><a id="Management+of+Security+Artifacts">Management of Security Artifacts</a> <a href="#Management+of+Security+Artifacts"><img src="markbook-section-link.png"/></a></h4><p>There are a number of artifacts that are used by the gateway in ensuring the security of wire level communications, access to protected resources and the encryption of sensitive data. These artifacts can be managed from outside of the gateway instances or generated and populated by the gateway instance itself.</p><p>The following is a description of how this is coordinated with both standalone (development, demo, etc) gateway instances and instances as part of a cluster of gateways in mind.</p><p>Upon start of the gateway server we:</p>
+<ol>
+  <li>Look for an identity store at <code>data/security/keystores/gateway.jks</code>.  The identity store contains the certificate and private key used to represent the identity of the server for SSL connections and signature creation.
+  <ul>
+    <li>If there is no identity store we create one and generate a self-signed certificate for use in standalone/demo mode.  The certificate is stored with an alias of gateway-identity.</li>
+    <li>If an identity store is found then we ensure that it can be loaded using the provided master secret and that there is an alias called gateway-identity.</li>
+  </ul></li>
+  <li>Look for a credential store at <code>data/security/keystores/__gateway-credentials.jceks</code>.  This credential store is used to store secrets/passwords that are used by the gateway.  For instance, this is where the passphrase for accessing the gateway-identity certificate is kept.
+  <ul>
+    <li>If there is no credential store found then we create one and populate it with a generated passphrase for the alias <code>gateway-identity-passphrase</code>.  This is coordinated with the population of the self-signed cert into the identity-store.</li>
+    <li>If a credential store is found then we ensure that it can be loaded using the provided master secret and that the expected aliases have been populated with secrets.</li>
+  </ul></li>
+</ol><p>Upon deployment of a Hadoop cluster topology within the gateway we:</p>
+<ol>
+  <li>Look for a credential store for the topology. For instance, we have a sample topology that gets deployed out of the box. We look for <code>data/security/keystores/sandbox-credentials.jceks</code>. This topology specific credential store is used for storing secrets/passwords that are used for encrypting sensitive data with topology specific keys.
+  <ul>
+    <li>If no credential store is found for the topology being deployed then one is created for it.  Population of the aliases is delegated to the configured providers within the system that will require the use of a secret for a particular task.  They may programmatically set the value of the secret or choose to have the value for the specified alias generated through the AliasService.</li>
+    <li>If a credential store is found then we ensure that it can be loaded with the provided master secret and the configured providers have the opportunity to ensure that the aliases are populated and if not to populate them.</li>
+  </ul></li>
+</ol><p>By leveraging the algorithm described above we can provide a window of opportunity for management of these artifacts in a number of ways.</p>
+<ol>
+  <li>Using a single gateway instance as a master instance the artifacts can be generated or placed into the expected location and then replicated across all of the slave instances before startup.</li>
+  <li>Using an NFS mount as a central location for the artifacts would provide a single source of truth without the need to replicate them over the network. Of course, NFS mounts have their own challenges.</li>
+  <li>Using the KnoxCLI to create and manage the security artifacts.</li>
+</ol><p>See the Knox CLI section for descriptions of the command line utilities related to the security artifact management.</p><h4><a id="Keystores">Keystores</a> <a href="#Keystores"><img src="markbook-section-link.png"/></a></h4><p>In order to provide your own certificate for use by the gateway, you will need to either import an existing key pair into a Java keystore or generate a self-signed cert using the Java keytool.</p><h5><a id="Importing+a+key+pair+into+a+Java+keystore">Importing a key pair into a Java keystore</a> <a href="#Importing+a+key+pair+into+a+Java+keystore"><img src="markbook-section-link.png"/></a></h5><p>One way to accomplish this is to start with a PKCS12 store for your key pair and then convert it to a Java keystore or JKS.</p><p>The following example uses openssl to create a PKCS12 encoded store from your provided certificate and private key that are in PEM format.</p>
+<pre><code>openssl pkcs12 -export -in cert.pem -inkey key.pem &gt; server.p12
+</code></pre><p>The next example converts the PKCS12 store into a Java keystore (JKS). It should prompt you for the keystore and key passwords for the destination keystore. You must use the master-secret for the keystore password and keep track of the password that you use for the key passphrase.</p>
+<pre><code>keytool -importkeystore -srckeystore server.p12 -destkeystore gateway.jks -srcstoretype pkcs12
+</code></pre><p>When using this approach, there are a few important things to be aware of:</p>
+<ol>
+  <li><p>the alias MUST be &ldquo;gateway-identity&rdquo;. You may need to change it after the import of the PKCS12 store. You can use keytool to do this - for example:</p>
+  <pre><code>keytool -changealias -alias &quot;1&quot; -destalias &quot;gateway-identity&quot; -keystore gateway.jks -storepass {knoxpw}
+</code></pre></li>
+  <li><p>the name of the expected identity keystore for the gateway MUST be gateway.jks</p></li>
+  <li><p>the passwords for the keystore and the imported key may both be set to the master secret for the gateway install. You can change the key passphrase after import using keytool as well. You may need to do this in order to provision the password in the credential store as described later in this section. For example:</p>
+  <pre><code>keytool -keypasswd -alias gateway-identity -keystore gateway.jks
+</code></pre></li>
+</ol><p>NOTE: The password for the keystore as well as that of the imported key may be the master secret for the gateway instance or you may set the gateway-identity-passphrase alias using the Knox CLI to the actual key passphrase. See the Knox CLI section for details.</p><p>The following will allow you to provision the passphrase for the private key that you set during keystore creation above - it will prompt you for the actual passphrase.</p>
+<pre><code>bin/knoxcli.sh create-alias gateway-identity-passphrase
+</code></pre><h5><a id="Generating+a+self-signed+cert+for+use+in+testing+or+development+environments">Generating a self-signed cert for use in testing or development environments</a> <a href="#Generating+a+self-signed+cert+for+use+in+testing+or+development+environments"><img src="markbook-section-link.png"/></a></h5>
+<pre><code>keytool -genkey -keyalg RSA -alias gateway-identity -keystore gateway.jks \
+    -storepass {master-secret} -validity 360 -keysize 2048
+</code></pre><p>Keytool will prompt you for a number of elements that will comprise the distinguished name (DN) within your certificate.</p><p><em>NOTE:</em> When it prompts you for your first and last name, be sure to type in the hostname of the machine that your gateway instance will be running on. This is used by clients during hostname verification to ensure that the presented certificate matches the hostname that was used in the URL for the connection - so they need to match.</p><p><em>NOTE:</em> When it prompts for the key password, just press enter to ensure that it is the same as the keystore password, which, as described earlier, must match the master secret for the gateway instance. Alternatively, you can set it to another passphrase - take note of it and set the gateway-identity-passphrase alias to that passphrase using the Knox CLI.</p><p>See the Knox CLI section for descriptions of the command line utilities related to the management of the keystores.</p><h5><a id="U
 sing+a+CA+Signed+Key+Pair">Using a CA Signed Key Pair</a> <a href="#Using+a+CA+Signed+Key+Pair"><img src="markbook-section-link.png"/></a></h5><p>For certain deployments a certificate key pair that is signed by a trusted certificate authority is required. There are a number of different ways in which these certificates are acquired and can be converted and imported into the Apache Knox keystore.</p><p>The following steps have been used to do this and are provided here for guidance in your installation. You may have to adjust according to your environment.</p><p>General steps:</p>
+<ol>
+  <li><p>Stop Knox gateway and back up all files in <code>{GATEWAY_HOME}/data/security/keystores</code></p>
+  <pre><code>gateway.sh stop
+</code></pre></li>
+  <li><p>Create a new master key for Knox and persist it. The master key will be referred to in following steps as <code>$master-key</code></p>
+  <pre><code>knoxcli.sh create-master --force
+</code></pre></li>
+  <li><p>Create the identity keystore <code>gateway.jks</code> with the certificate stored under the alias <code>gateway-identity</code></p>
+  <pre><code>cd {GATEWAY_HOME}/data/security/keystores
+keytool -genkeypair -alias gateway-identity -keyalg RSA -keysize 2048 -dname &quot;CN=$fqdn_knox,OU=hdp,O=sdge&quot; -keypass $keypass -keystore gateway.jks -storepass $master-key -validity 300
+</code></pre><p>NOTE: <code>$fqdn_knox</code> is the hostname of the Knox host. Some may choose <code>$keypass</code> to be the same as <code>$master-key</code>.</p></li>
+  <li><p>Create the credential store to hold the <code>$keypass</code> from step 3. This creates the <code>__gateway-credentials.jceks</code> file</p>
+  <pre><code>knoxcli.sh create-alias gateway-identity-passphrase --value $keypass
+</code></pre></li>
+  <li><p>Generate a certificate signing request from the gateway.jks</p>
+  <pre><code>keytool -keystore gateway.jks -storepass $master-key -alias gateway-identity -certreq -file knox.csr
+</code></pre></li>
+  <li><p>Send the <code>knox.csr</code> file to the certificate authority (CA) and get back the signed certificate (<code>knox.signed</code>). You also need the CA certificate, which can normally be obtained through an openssl command, a web browser, or from the CA directly.</p></li>
+  <li><p>Import both the CA certificate (referred to as <code>corporateCA.cer</code>) and the signed Knox certificate back into <code>gateway.jks</code></p>
+  <pre><code>keytool -keystore gateway.jks -storepass $master-key -alias $hwhq -import -file corporateCA.cer  
+keytool -keystore gateway.jks -storepass $master-key -alias gateway-identity -import -file knox.signed  
+</code></pre><p>NOTE: Use any alias appropriate for the corporate CA.</p></li>
+  <li><p>Restart Knox gateway. Check <code>gateway.log</code> to verify that the gateway started properly and that the clusters are deployed. You can also check the timestamps on the cluster deployment files</p>
+  <pre><code>ls -alrt {GATEWAY_HOME}/data/deployments
+</code></pre></li>
+  <li><p>Verify that clients can use the CA certificate to access Knox (which is the goal of using a publicly signed cert) using curl or a web browser that has the CA certificate installed</p>
+  <pre><code>curl --cacert corporateCA.cer -u hdptester:hadoop -X GET &quot;https://$fqdn_knox:8443/gateway/$topologyname/webhdfs/v1/tmp?op=LISTSTATUS&quot;
+</code></pre></li>
+</ol><h5><a id="Credential+Store">Credential Store</a> <a href="#Credential+Store"><img src="markbook-section-link.png"/></a></h5><p>Whenever you provide your own keystore with either a self-signed cert or an issued certificate signed by a trusted authority, you will need to set an alias for the gateway-identity-passphrase or create an empty credential store. This is necessary for the current release in order for the system to determine the correct password for the keystore and the key.</p><p>The credential stores in Knox use the JCEKS keystore type as it allows for the storage of general secrets in addition to certificates.</p><p>Keytool may be used to create credential stores but the Knox CLI section details how to create aliases. These aliases are managed within credential stores which are created by the CLI as needed. The simplest approach is to create the gateway-identity-passphrase alias with the Knox CLI. This will create the credential store if it doesn&rsquo;t already exist
  and add the key passphrase.</p><p>See the Knox CLI section for descriptions of the command line utilities related to the management of the credential stores.</p><h5><a id="Provisioning+of+Keystores">Provisioning of Keystores</a> <a href="#Provisioning+of+Keystores"><img src="markbook-section-link.png"/></a></h5><p>Once you have created these keystores you must move them into place for the gateway to discover them and use them to represent its identity for SSL connections. This is done by copying the keystores to the <code>{GATEWAY_HOME}/data/security/keystores</code> directory for your gateway install.</p><h4><a id="Summary+of+Secrets+to+be+Managed">Summary of Secrets to be Managed</a> <a href="#Summary+of+Secrets+to+be+Managed"><img src="markbook-section-link.png"/></a></h4>
+<ol>
+  <li>Master secret - the same for all gateway instances in a cluster of gateways</li>
+  <li>All security related artifacts are protected with the master secret</li>
+  <li>Secrets used by the gateway itself are stored within the gateway credential store and are the same across all gateway instances in the cluster of gateways</li>
+  <li>Secrets used by providers within cluster topologies are stored in topology specific credential stores and are the same for the same topology across the cluster of gateway instances.  However, they are specific to the topology - so secrets for one hadoop cluster are different from those of another.  This allows for fail-over from one gateway instance to another even when encryption is being used while not allowing the compromise of one encryption key to expose the data for all clusters.</li>
+</ol><p>NOTE: the SSL certificate will need special consideration depending on the type of certificate. Wildcard certs may be able to be shared across all gateway instances in a cluster. When certs are dedicated to specific machines the gateway identity store will not be able to be blindly replicated as host name verification problems will ensue. Obviously, trust-stores will need to be taken into account as well.</p><h3><a id="Knox+CLI">Knox CLI</a> <a href="#Knox+CLI"><img src="markbook-section-link.png"/></a></h3><p>The Knox CLI is a command line utility for the management of various aspects of the Knox deployment. It is primarily concerned with the management of the security artifacts for the gateway instance and each of the deployed topologies or Hadoop clusters that are gated by the Knox Gateway instance.</p><p>The various security artifacts are also generated and populated automatically by the Knox Gateway runtime when they are not found at startup. The assumptions made in tho
 se cases are appropriate for a test or development gateway instance and assume &lsquo;localhost&rsquo; for hostname specific activities. For production deployments, the CLI may aid in managing these artifacts.</p><p>The knoxcli.sh script is located in the <code>{GATEWAY_HOME}/bin</code> directory.</p><h4><a id="Help">Help</a> <a href="#Help"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+[--help]`"><code>bin/knoxcli.sh [--help]</code></a> <a href="#`bin/knoxcli.sh+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>Prints help for all commands.</p><h4><a id="Knox+Version+Info">Knox Version Info</a> <a href="#Knox+Version+Info"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+version+[--help]`"><code>bin/knoxcli.sh version [--help]</code></a> <a href="#`bin/knoxcli.sh+version+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>Displays Knox version information.</p><h4><a id="Master+secret+persi
 stence">Master secret persistence</a> <a href="#Master+secret+persistence"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+create-master+[--force][--help]`"><code>bin/knoxcli.sh create-master [--force][--help]</code></a> <a href="#`bin/knoxcli.sh+create-master+[--force][--help]`"><img src="markbook-section-link.png"/></a></h5><p>Creates and persists an encrypted master secret in a file within <code>{GATEWAY_HOME}/data/security/master</code>. </p><p>NOTE: This command fails when there is an existing master file in the expected location. You may force it to overwrite the master file with the <code>--force</code> switch. Doing so will require you to change the passwords protecting the gateway identity keystore and all credential stores.</p><h4><a id="Alias+creation">Alias creation</a> <a href="#Alias+creation"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+create-alias+name+[--cluster+c]+[--value+v]+[--generate]+[--help]`"><code>
 bin/knoxcli.sh create-alias name [--cluster c] [--value v] [--generate] [--help]</code></a> <a href="#`bin/knoxcli.sh+create-alias+name+[--cluster+c]+[--value+v]+[--generate]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>Creates a password alias and stores it in a credential store within the <code>{GATEWAY_HOME}/data/security/keystores</code> dir. </p>
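+<p>As a minimal sketch of how this command might be used, the script below creates one gateway-level alias and one cluster-level alias. The <code>/opt/knox</code> default for <code>GATEWAY_HOME</code>, the <code>sandbox</code> cluster name, the <code>exampleServicePassword</code> alias and the <code>knoxcli</code> helper function are assumptions made for illustration. When <code>knoxcli.sh</code> is not found, the helper only prints the command it would have run.</p>

```shell
#!/bin/sh
# Sketch only: the GATEWAY_HOME default, the "sandbox" cluster name and the
# "exampleServicePassword" alias are assumptions, not values from this guide.
GATEWAY_HOME="${GATEWAY_HOME:-/opt/knox}"

# Run knoxcli.sh with the given arguments, or print the command line
# when the script is not installed at the expected location.
knoxcli() {
  if [ -x "$GATEWAY_HOME/bin/knoxcli.sh" ]; then
    "$GATEWAY_HOME/bin/knoxcli.sh" "$@"
  else
    echo "would run: $GATEWAY_HOME/bin/knoxcli.sh $*"
  fi
}

# Gateway-level alias; no --value is given, so the CLI prompts for it.
knoxcli create-alias gateway-identity-passphrase

# Cluster-level alias with a generated value (no prompt, no --value).
knoxcli create-alias exampleServicePassword --cluster sandbox --generate
```

+<p>Leaving out <code>--value</code> keeps the passphrase out of the shell history, at the cost of an interactive prompt.</p>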
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>name</td>
+      <td>name of the alias to create</td>
+    </tr>
+    <tr>
+      <td>--cluster</td>
+      <td>name of Hadoop cluster for the cluster specific credential store otherwise assumes that it is for the gateway itself</td>
+    </tr>
+    <tr>
+      <td>--value</td>
+      <td>parameter for specifying the actual password otherwise prompted. Escape complex passwords or surround with single quotes.<br/></td>
+    </tr>
+    <tr>
+      <td>--generate</td>
+      <td>boolean flag to indicate whether the tool should just generate the value. This assumes that --value is not set - will result in error otherwise. User will not be prompted for the value when --generate is set.</td>
+    </tr>
+  </tbody>
+</table><h4><a id="Alias+deletion">Alias deletion</a> <a href="#Alias+deletion"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+delete-alias+name+[--cluster+c]+[--help]`"><code>bin/knoxcli.sh delete-alias name [--cluster c] [--help]</code></a> <a href="#`bin/knoxcli.sh+delete-alias+name+[--cluster+c]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>Deletes a password and alias mapping from a credential store within <code>{GATEWAY_HOME}/data/security/keystores</code>.</p>
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>name </td>
+      <td>name of the alias to delete</td>
+    </tr>
+    <tr>
+      <td>--cluster </td>
+      <td>name of Hadoop cluster for the cluster specific credential store otherwise assumes &rsquo;__gateway&rsquo;</td>
+    </tr>
+  </tbody>
+</table><h4><a id="Alias+listing">Alias listing</a> <a href="#Alias+listing"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+list-alias+[--cluster+c]+[--help]`"><code>bin/knoxcli.sh list-alias [--cluster c] [--help]</code></a> <a href="#`bin/knoxcli.sh+list-alias+[--cluster+c]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>Lists the alias names for the credential store within <code>{GATEWAY_HOME}/data/security/keystores</code>.</p><p>NOTE: This command will list the aliases in lowercase which is a result of the underlying credential store implementation. Lookup of credentials is a case insensitive operation - so this is not an issue.</p>
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>--cluster </td>
+      <td>name of Hadoop cluster for the cluster specific credential store otherwise assumes &rsquo;__gateway&rsquo;</td>
+    </tr>
+  </tbody>
+</table><h4><a id="Self-signed+cert+creation">Self-signed cert creation</a> <a href="#Self-signed+cert+creation"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+create-cert+[--hostname+n]+[--help]`"><code>bin/knoxcli.sh create-cert [--hostname n] [--help]</code></a> <a href="#`bin/knoxcli.sh+create-cert+[--hostname+n]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>Creates and stores a self-signed certificate to represent the identity of the gateway instance. This is stored within the <code>{GATEWAY_HOME}/data/security/keystores/gateway.jks</code> keystore. </p>
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>--hostname</td>
+      <td>name of the host to be used in the self-signed certificate. This allows multi-host deployments to specify the proper hostnames for hostname verification to succeed on the client side of the SSL connection. The default is &lsquo;localhost&rsquo;.</td>
+    </tr>
+  </tbody>
+</table><h4><a id="Topology+Redeploy">Topology Redeploy</a> <a href="#Topology+Redeploy"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+redeploy+[--cluster+c]`"><code>bin/knoxcli.sh redeploy [--cluster c]</code></a> <a href="#`bin/knoxcli.sh+redeploy+[--cluster+c]`"><img src="markbook-section-link.png"/></a></h5><p>Redeploys one or all of the gateway&rsquo;s clusters (a.k.a. topologies).</p><h4><a id="Topology+Listing">Topology Listing</a> <a href="#Topology+Listing"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+list-topologies+[--help]`"><code>bin/knoxcli.sh list-topologies [--help]</code></a> <a href="#`bin/knoxcli.sh+list-topologies+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>Lists all of the topologies found in Knox&rsquo;s topologies directory. Useful for specifying a valid <code>--cluster</code> argument.</p><h4><a id="Topology+Validation">Topology Validation</a> <a href="#Topology+Validation"><img src="markbook-se
 ction-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+validate-topology+[--cluster+c]+[--path+path]+[--help]`"><code>bin/knoxcli.sh validate-topology [--cluster c] [--path path] [--help]</code></a> <a href="#`bin/knoxcli.sh+validate-topology+[--cluster+c]+[--path+path]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>This ensures that a cluster&rsquo;s description (a.k.a. topology) follows the correct formatting rules. It is possible to specify the name of a cluster already in the topology directory, or a path to any file.</p>
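+<p>The <code>--path</code> form lends itself to checking several topology files in one pass. The following is a rough sketch that prints one <code>validate-topology</code> command per XML file in a directory. The <code>conf/topologies</code> location, the <code>/opt/knox</code> default and the <code>validate_commands</code> function name are assumptions for illustration only.</p>

```shell
#!/bin/sh
# Sketch only: the conf/topologies location and the /opt/knox default
# are assumptions; adjust them to your installation.
GATEWAY_HOME="${GATEWAY_HOME:-/opt/knox}"
TOPOLOGY_DIR="${TOPOLOGY_DIR:-$GATEWAY_HOME/conf/topologies}"

# Print one validate-topology command per topology XML file in directory $1.
validate_commands() {
  for topology_file in "$1"/*.xml; do
    # Skip the unexpanded pattern when the directory has no XML files.
    [ -e "$topology_file" ] || continue
    echo "$GATEWAY_HOME/bin/knoxcli.sh validate-topology --path $topology_file"
  done
}

validate_commands "$TOPOLOGY_DIR"
```

+<p>The printed commands can be reviewed first and then run by hand or piped to <code>sh</code>.</p>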
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>--cluster </td>
+      <td>name of the Hadoop cluster that you want to validate</td>
+    </tr>
+    <tr>
+      <td>--path </td>
+      <td>path to topology file that you wish to validate.</td>
+    </tr>
+  </tbody>
+</table><h4><a id="LDAP+Authentication+and+Authorization">LDAP Authentication and Authorization</a> <a href="#LDAP+Authentication+and+Authorization"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+user-auth-test+[--cluster+c]+[--u+username]+[--p+password]+[--g]+[--d]+[--help]`"><code>bin/knoxcli.sh user-auth-test [--cluster c] [--u username] [--p password] [--g] [--d] [--help]</code></a> <a href="#`bin/knoxcli.sh+user-auth-test+[--cluster+c]+[--u+username]+[--p+password]+[--g]+[--d]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>This command will test a topology&rsquo;s ability to connect, authenticate, and authorize a user with an LDAP server. The only required argument is the <code>--cluster</code> argument to specify the name of the topology you wish to use. The topology must be valid (it must pass the validate-topology command). If the <code>--u</code> and <code>--p</code> arguments are not specified, the command line will prompt for a username and password. If authentication
  is successful then the command will attempt to use the topology to do an LDAP group lookup. The topology must be configured correctly to do this. If it is not, groups will not be returned and no errors will be printed unless the <code>--g</code> option is specified. Currently this command only works if a topology supports the use of ShiroProvider for authentication.</p>
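+<p>As a small sketch of a typical invocation, the function below prints a <code>user-auth-test</code> command for each topology name it is given. It leaves out <code>--u</code> and <code>--p</code> so that the CLI prompts for them, and adds <code>--g</code> and <code>--d</code> so that group lookup problems and debug details are reported. The <code>sandbox</code> topology name and the <code>auth_test_commands</code> function name are placeholders, not names from this guide.</p>

```shell
#!/bin/sh
# Sketch only: the topology names passed in are placeholders (assumptions).

# Print one user-auth-test command per topology name given as an argument.
# --u and --p are left out on purpose so that the CLI prompts for them;
# --g asks for the group lookup (and its errors), --d for debug output.
auth_test_commands() {
  for topology in "$@"; do
    echo "bin/knoxcli.sh user-auth-test --cluster $topology --g --d"
  done
}

auth_test_commands sandbox
```

+<p>Run the printed command from <code>{GATEWAY_HOME}</code> so that the relative <code>bin/knoxcli.sh</code> path resolves.</p>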
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>--cluster </td>
+      <td>Required; name of cluster for which you want to test authentication</td>
+    </tr>
+    <tr>
+      <td>--u </td>
+      <td>Optional; username you wish to authenticate with</td>
+    </tr>
+    <tr>
+      <td>--p </td>
+      <td>Optional; password you wish to authenticate with</td>
+    </tr>
+    <tr>
+      <td>--g </td>
+      <td>Optional; Specify that you are looking to return a user&rsquo;s groups. If not specified, group lookup errors won&rsquo;t be reported.</td>
+    </tr>
+    <tr>
+      <td>--d </td>
+      <td>Optional; Print extra debug info on failed authentication</td>
+    </tr>
+  </tbody>
+</table><h4><a id="Topology+LDAP+Bind">Topology LDAP Bind</a> <a href="#Topology+LDAP+Bind"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+system-user-auth-test+[--cluster+c]+[--d]+[--help]`"><code>bin/knoxcli.sh system-user-auth-test [--cluster c] [--d] [--help]</code></a> <a href="#`bin/knoxcli.sh+system-user-auth-test+[--cluster+c]+[--d]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>This command will test a given topology&rsquo;s ability to connect, bind, and authenticate with the LDAP server using the settings specified in the topology file. The bind currently only works with Shiro as the authentication provider. There are also two parameters required inside of the topology for these </p>
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>--cluster </td>
+      <td>Required; name of cluster for which you want to test authentication</td>
+    </tr>
+    <tr>
+      <td>--d </td>
+      <td>Optional; Print extra debug info on failed authentication</td>
+    </tr>
+  </tbody>
+</table><h4><a id="Gateway+Service+Test">Gateway Service Test</a> <a href="#Gateway+Service+Test"><img src="markbook-section-link.png"/></a></h4><h5><a id="`bin/knoxcli.sh+service-test+[--cluster+c]+[--hostname+hostname]+[--port+port]+[--u+username]+[--p+password]+[--d]+[--help]`"><code>bin/knoxcli.sh service-test [--cluster c] [--hostname hostname] [--port port] [--u username] [--p password] [--d] [--help]</code></a> <a href="#`bin/knoxcli.sh+service-test+[--cluster+c]+[--hostname+hostname]+[--port+port]+[--u+username]+[--p+password]+[--d]+[--help]`"><img src="markbook-section-link.png"/></a></h5><p>This will test a topology configuration&rsquo;s ability to connect to multiple Hadoop services. Each service found in a topology will be tested with multiple URLs. Results are printed to the console in JSON format.</p>
+<table>
+  <thead>
+    <tr>
+      <th>argument </th>
+      <th>description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>--cluster </td>
+      <td>Required; name of cluster for which you want to test authentication</td>
+    </tr>
+    <tr>
+      <td>--hostname </td>
+      <td>Required; hostname of the cluster currently running on the machine</td>
+    </tr>
+    <tr>
+      <td>--port </td>
+      <td>Optional; port that the cluster is running on. If not supplied CLI will try to read config files to find the port.</td>
+    </tr>
+    <tr>
+      <td>--u </td>
+      <td>Required; username to authorize against Hadoop services</td>
+    </tr>
+    <tr>
+      <td>--p </td>
+      <td>Required; password to match username</td>
+    </tr>
+    <tr>
+      <td>--d </td>
+      <td>Optional; Print extra debug info on failed authentication</td>
+    </tr>
+  </tbody>
+</table><h3><a id="Admin+API">Admin API</a> <a href="#Admin+API"><img src="markbook-section-link.png"/></a></h3><p>Access to the administrator functions of Knox is provided by the Admin REST API.</p><h4><a id="Admin+API+URL">Admin API URL</a> <a href="#Admin+API+URL"><img src="markbook-section-link.png"/></a></h4><p>The URL mapping for the Knox Admin API is simple:</p>
+<table>
+  <tbody>
+    <tr>
+      <td>Gateway </td>
+      <td><code>https://{gateway-host}:{gateway-port}/{gateway-path}/admin/api/v1</code> </td>
+    </tr>
+  </tbody>
+</table><p>Please note that to access the Admin API, the user attempting to connect must have admin credentials inside of the LDAP server.</p><h5><a id="API+Documentation">API Documentation</a> <a href="#API+Documentation"><img src="markbook-section-link.png"/></a></h5><h6><a id="Operations">Operations</a> <a href="#Operations"><img src="markbook-section-link.png"/></a></h6>
+<ul>
+  <li><h6>HTTP GET</h6></li>
+</ul>
+<ol>
+  <li><a href="#Server+Version">Server Version</a></li>
+  <li><a href="#Topology+Collection">Topology Collection</a></li>
+  <li><a href="#Topology">Topology</a></li>
+</ol>
+<ul>
+  <li><h6>HTTP PUT</h6></li>
+  <li><h6>HTTP DELETE</h6></li>
+</ul><h5><a id="Server+Version">Server Version</a> <a href="#Server+Version"><img src="markbook-section-link.png"/></a></h5><h6><a id="Description">Description</a> <a href="#Description"><img src="markbook-section-link.png"/></a></h6><p>Calls Knox and returns the gateway&rsquo;s current version and the version hash inside of a JSON object. </p><h6><a id="Example+Request+URL">Example Request URL</a> <a href="#Example+Request+URL"><img src="markbook-section-link.png"/></a></h6><p><code>https://{gateway-host}:{gateway-port}/{gateway-path}/admin/api/v1/version</code> </p><h6><a id="Example+cURL+Request">Example cURL Request</a> <a href="#Example+cURL+Request"><img src="markbook-section-link.png"/></a></h6><p><code>curl -u admin:admin-password -i -k https://{gateway-host}:{gateway-port}/{gateway-path}/admin/api/v1/version</code></p><h6><a id="Response">Response</a> <a href="#Response"><img src="markbook-section-link.png"/></a></h6>
+<pre><code>&lt;ServerVersion&gt;
+    &lt;version&gt;{version-number}&lt;/version&gt;
+    &lt;hash&gt;{version-hash}&lt;/hash&gt;
+&lt;/ServerVersion&gt;
+</code></pre><h5><a id="Topology+Collection">Topology Collection</a> <a href="#Topology+Collection"><img src="markbook-section-link.png"/></a></h5><h6><a id="Description">Description</a> <a href="#Description"><img src="markbook-section-link.png"/></a></h6><p>Calls Knox and returns an array of JSON objects that represent the topologies currently deployed in the gateway. </p><h6><a id="Example+Request+URL">Example Request URL</a> <a href="#Example+Request+URL"><img src="markbook-section-link.png"/></a></h6><p><code>https://{gateway-host}:{gateway-port}/{gateway-path}/admin/api/{api-version}/topologies</code></p><h6><a id="Example+cURL+Request">Example cURL Request</a> <a href="#Example+cURL+Request"><img src="markbook-section-link.png"/></a></h6><p><code>curl -u admin:admin-password -i -k -H Accept:application/json https://{gateway-host}:{gateway-port}/{gateway-path}/admin/api/v1/topologies</code></p><h6><a id="Response">Response</a> <a href="#Response"><img src="mar
 kbook-section-link.png"/></a></h6>
+<pre><code>[  
+  {  
+    &quot;href&quot;:&quot;https://localhost:8443/gateway/admin/api/v1/topologies/_default&quot;,
+    &quot;name&quot;:&quot;_default&quot;,

[... 4072 lines stripped ...]