Posted to commits@knox.apache.org by lm...@apache.org on 2017/03/17 01:16:35 UTC

svn commit: r1787275 - in /knox: site/books/knox-0-12-0/user-guide.html trunk/books/0.12.0/book.md trunk/books/0.12.0/book_client-details.md

Author: lmccay
Date: Fri Mar 17 01:16:34 2017
New Revision: 1787275

URL: http://svn.apache.org/viewvc?rev=1787275&view=rev
Log:
Adding docs for KnoxShell to Client Details in 0.12.0

Modified:
    knox/site/books/knox-0-12-0/user-guide.html
    knox/trunk/books/0.12.0/book.md
    knox/trunk/books/0.12.0/book_client-details.md

Modified: knox/site/books/knox-0-12-0/user-guide.html
URL: http://svn.apache.org/viewvc/knox/site/books/knox-0-12-0/user-guide.html?rev=1787275&r1=1787274&r2=1787275&view=diff
==============================================================================
--- knox/site/books/knox-0-12-0/user-guide.html (original)
+++ knox/site/books/knox-0-12-0/user-guide.html Fri Mar 17 01:16:34 2017
@@ -67,7 +67,15 @@
   </ul></li>
   <li><a href="#Websocket+Support">Websocket Support</a></li>
   <li><a href="#Audit">Audit</a></li>
-  <li><a href="#Client+Details">Client Details</a></li>
+  <li><a href="#Client+Details">Client Details</a>
+  <ul>
+    <li><a href="#Client+Quickstart">Client Quickstart</a></li>
+    <li><a href="#Client+Token+Sessions">Client Token Sessions</a>
+    <ul>
+      <li><a href="#Server+Setup">Server Setup</a></li>
+    </ul></li>
+    <li><a href="#Client+DSL+and+SDK+Details">Client DSL and SDK Details</a></li>
+  </ul></li>
   <li><a href="#Service+Details">Service Details</a>
   <ul>
     <li><a href="#WebHDFS">WebHDFS</a></li>
@@ -3114,7 +3122,185 @@ APACHE_HOME/bin/apachectl -k stop
       <td>Logging message. Contains additional tracking information.</td>
     </tr>
   </tbody>
-</table><h4><a id="Audit+log+rotation">Audit log rotation</a> <a href="#Audit+log+rotation"><img src="markbook-section-link.png"/></a></h4><p>Audit logging is preconfigured with <code>org.apache.log4j.DailyRollingFileAppender</code>. <a href="http://logging.apache.org/log4j/1.2/">Apache log4j</a> contains information about other Appenders.</p><h4><a id="How+to+change+the+audit+level+or+disable+it">How to change the audit level or disable it</a> <a href="#How+to+change+the+audit+level+or+disable+it"><img src="markbook-section-link.png"/></a></h4><p>All audit messages are logged at <code>INFO</code> level and this behavior can&rsquo;t be changed.</p><p>Disabling auditing can be done by decreasing the log level for the Audit appender or setting it to <code>OFF</code>.</p><h2><a id="Client+Details">Client Details</a> <a href="#Client+Details"><img src="markbook-section-link.png"/></a></h2><p>Hadoop requires a client that can be used to interact remotely with the services provided by Had
 oop cluster. This will also be true when using the Apache Knox Gateway to provide perimeter security and centralized access for these services. The two primary existing clients for Hadoop are the CLI (i.e. Command Line Interface, hadoop) and <a href="http://gethue.com/">Hue</a> (i.e. Hadoop User Experience). For several reasons however, neither of these clients can <em>currently</em> be used to access Hadoop services via the Apache Knox Gateway.</p><p>This led to thinking about a very simple client that could help people use and evaluate the gateway. The list below outlines the general requirements for such a client.</p>
+</table><h4><a id="Audit+log+rotation">Audit log rotation</a> <a href="#Audit+log+rotation"><img src="markbook-section-link.png"/></a></h4><p>Audit logging is preconfigured with <code>org.apache.log4j.DailyRollingFileAppender</code>. <a href="http://logging.apache.org/log4j/1.2/">Apache log4j</a> contains information about other Appenders.</p><h4><a id="How+to+change+the+audit+level+or+disable+it">How to change the audit level or disable it</a> <a href="#How+to+change+the+audit+level+or+disable+it"><img src="markbook-section-link.png"/></a></h4><p>All audit messages are logged at <code>INFO</code> level and this behavior can&rsquo;t be changed.</p><p>Disabling auditing can be done by decreasing the log level for the Audit appender or setting it to <code>OFF</code>.</p><h2><a id="Client+Details">Client Details</a> <a href="#Client+Details"><img src="markbook-section-link.png"/></a></h2><p>The KnoxShell release artifact provides a small footprint client environment that removes all unnecessary server dependencies, configuration, binary scripts, etc. It comprises a few different components that empower different types of users.</p>
+<ul>
+  <li>A set of SDK-type classes providing access to Hadoop resources over HTTP</li>
+  <li>A Groovy-based DSL for scripting access to Hadoop resources based on the underlying SDK classes</li>
+  <li>KnoxShell Token-based Sessions that provide a CLI SSO session for executing multiple scripts</li>
+</ul><p>The following sections provide an overview and quickstart for the KnoxShell.</p><h3><a id="Client+Quickstart">Client Quickstart</a> <a href="#Client+Quickstart"><img src="markbook-section-link.png"/></a></h3><p>The following installation and setup instructions should get you started with using the KnoxShell very quickly.</p>
+<ol>
+  <li><p>Download a knoxshell-x.x.x.zip or tar file and unzip it in your preferred location {GATEWAY_CLIENT_HOME}</p>
+  <pre><code>home:knoxshell-0.12.0 larry$ ls -l
+total 296
+-rw-r--r--@  1 larry  staff  71714 Mar 14 14:06 LICENSE
+-rw-r--r--@  1 larry  staff    164 Mar 14 14:06 NOTICE
+-rw-r--r--@  1 larry  staff  71714 Mar 15 20:04 README
+drwxr-xr-x@ 12 larry  staff    408 Mar 15 21:24 bin
+drwxr--r--@  3 larry  staff    102 Mar 14 14:06 conf
+drwxr-xr-x+  3 larry  staff    102 Mar 15 12:41 logs
+drwxr-xr-x@ 18 larry  staff    612 Mar 14 14:18 samples
+</code></pre>
+  <table>
+    <thead>
+      <tr>
+        <th>Directory </th>
+        <th>Description </th>
+      </tr>
+    </thead>
+    <tbody>
+      <tr>
+        <td>bin </td>
+        <td>contains the main knoxshell jar and related shell scripts</td>
+      </tr>
+      <tr>
+        <td>conf </td>
+        <td>only contains log4j config</td>
+      </tr>
+      <tr>
+        <td>logs </td>
+        <td>contains the knoxshell.log file</td>
+      </tr>
+      <tr>
+        <td>samples </td>
+        <td>has numerous examples to help you get started</td>
+      </tr>
+    </tbody>
+  </table></li>
+  <li><p>cd {GATEWAY_CLIENT_HOME}</p></li>
+  <li>Get/set up the truststore for the target Knox instance or fronting load balancer
+  <ul>
+    <li>if you have access to the server you may use the command knoxcli.sh export-cert --type JKS</li>
+    <li>copy the resulting gateway-client-identity.jks to your user home directory</li>
+  </ul></li>
+  <li><p>Execute an example script from the {GATEWAY_CLIENT_HOME}/samples directory - for instance:</p>
+  <ul>
+    <li>bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy</li>
+  </ul>
+  <pre><code>home:knoxshell-0.12.0 larry$ bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy
+Enter username: guest
+Enter password:
+[app-logs, apps, mapred, mr-history, tmp, user]
+</code></pre></li>
+</ol><p>At this point, you should have seen something similar to the above output - probably with different directories listed. Take a look at the sample that we ran above:</p>
+<pre><code>import groovy.json.JsonSlurper
+import org.apache.hadoop.gateway.shell.Hadoop
+import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+import org.apache.hadoop.gateway.shell.Credentials
+
+gateway = &quot;https://localhost:8443/gateway/sandbox&quot;
+
+credentials = new Credentials()
+credentials.add(&quot;ClearInput&quot;, &quot;Enter username: &quot;, &quot;user&quot;)
+                .add(&quot;HiddenInput&quot;, &quot;Enter pas&quot; + &quot;sword: &quot;, &quot;pass&quot;)
+credentials.collect()
+
+username = credentials.get(&quot;user&quot;).string()
+pass = credentials.get(&quot;pass&quot;).string()
+
+session = Hadoop.login( gateway, username, pass )
+
+text = Hdfs.ls( session ).dir( &quot;/&quot; ).now().string
+json = (new JsonSlurper()).parseText( text )
+println json.FileStatuses.FileStatus.pathSuffix
+session.shutdown()
+</code></pre><p>Some things to note about this sample:</p>
+<ol>
+  <li>the gateway URL is hardcoded
+  <ul>
+    <li>alternatives would be passing it as an argument to the script, using an environment variable or prompting for it with a ClearInput credential collector</li>
+  </ul></li>
+  <li>credential collectors are used to gather credentials or other input from various sources. In this sample the HiddenInput and ClearInput collectors prompt the user for the input with the provided prompt text and the values are acquired by a subsequent get call with the provided name value.</li>
+  <li>The Hadoop.login method establishes a login session of sorts which will need to be provided to the various API classes as an argument.</li>
+  <li>the response text is easily retrieved as a string and can be parsed by the JsonSlurper or whatever you like</li>
+</ol><h3><a id="Client+Token+Sessions">Client Token Sessions</a> <a href="#Client+Token+Sessions"><img src="markbook-section-link.png"/></a></h3><p>Building on the Quickstart above we will drill into some of the token session details here and walk through another sample.</p><p>Unlike the quickstart, token sessions require the server to be configured in specific ways to allow the use of token sessions/federation.</p><h4><a id="Server+Setup">Server Setup</a> <a href="#Server+Setup"><img src="markbook-section-link.png"/></a></h4>
+<ol>
+  <li><p>KnoxToken service should be added to your sandbox.xml topology - see the <a href="#KnoxToken+Configuration">KnoxToken Configuration Section</a></p>
+  <pre><code>&lt;service&gt;
+   &lt;role&gt;KNOXTOKEN&lt;/role&gt;
+   &lt;param&gt;
+      &lt;name&gt;knox.token.ttl&lt;/name&gt;
+      &lt;value&gt;36000000&lt;/value&gt;
+   &lt;/param&gt;
+   &lt;param&gt;
+      &lt;name&gt;knox.token.audiences&lt;/name&gt;
+      &lt;value&gt;tokenbased&lt;/value&gt;
+   &lt;/param&gt;
+   &lt;param&gt;
+      &lt;name&gt;knox.token.target.url&lt;/name&gt;
+      &lt;value&gt;https://localhost:8443/gateway/tokenbased&lt;/value&gt;
+   &lt;/param&gt;
+&lt;/service&gt;
+</code></pre></li>
+  <li><p>Add a tokenbased.xml topology that accepts tokens as federation tokens for access to exposed resources, using the <a href="#JWT+Provider">JWT Provider</a></p>
+  <pre><code>&lt;provider&gt;
+   &lt;role&gt;federation&lt;/role&gt;
+   &lt;name&gt;JWTProvider&lt;/name&gt;
+   &lt;enabled&gt;true&lt;/enabled&gt;
+   &lt;param&gt;
+       &lt;name&gt;knox.token.audiences&lt;/name&gt;
+       &lt;value&gt;tokenbased&lt;/value&gt;
+   &lt;/param&gt;
+&lt;/provider&gt;
+</code></pre></li>
+  <li>Use the KnoxShell token commands to establish and manage your session
+  <ul>
+    <li>bin/knoxshell.sh init <a href="https://localhost:8443/gateway/sandbox">https://localhost:8443/gateway/sandbox</a> to acquire a token and cache it in the user home directory</li>
+    <li>bin/knoxshell.sh list to display the details of the cached token, the expiration time and optionally the target URL</li>
+    <li>bin/knoxshell.sh destroy to remove the cached session token and terminate the session</li>
+  </ul></li>
+  <li><p>Execute a script that can take advantage of the token credential collector and target url</p>
+  <pre><code>import groovy.json.JsonSlurper
+import java.util.HashMap
+import java.util.Map
+import org.apache.hadoop.gateway.shell.Credentials
+import org.apache.hadoop.gateway.shell.Hadoop
+import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+credentials = new Credentials()
+credentials.add(&quot;KnoxToken&quot;, &quot;none: &quot;, &quot;token&quot;)
+credentials.collect()
+
+token = credentials.get(&quot;token&quot;).string()
+
+gateway = System.getenv(&quot;KNOXSHELL_TOPOLOGY_URL&quot;)
+if (gateway == null || gateway.equals(&quot;&quot;)) {
+  gateway = credentials.get(&quot;token&quot;).getTargetUrl()
+}
+
+println &quot;&quot;
+println &quot;*****************************GATEWAY INSTANCE**********************************&quot;
+println gateway
+println &quot;*******************************************************************************&quot;
+println &quot;&quot;
+
+headers = new HashMap()
+headers.put(&quot;Authorization&quot;, &quot;Bearer &quot; + token)
+
+session = Hadoop.login( gateway, headers )
+
+if (args.length &gt; 0) {
+  dir = args[0]
+} else {
+  dir = &quot;/&quot;
+}
+
+text = Hdfs.ls( session ).dir( dir ).now().string
+json = (new JsonSlurper()).parseText( text )
+statuses = json.get(&quot;FileStatuses&quot;);
+
+println statuses
+
+session.shutdown()
+</code></pre></li>
+</ol><p>Note the following about the above sample script:</p>
+<ol>
+  <li>use of the KnoxToken credential collector</li>
+  <li>use of the targetUrl from the credential collector</li>
+  <li>optional override of the target url with environment variable</li>
+  <li>the passing of the headers map to the session creation in Hadoop.login</li>
+  <li>the passing of an argument for the ls command for the path to list or default to &ldquo;/&rdquo;</li>
+</ol><p>Also note that there is no reason to prompt for username and password as long as the token has not been destroyed or expired. There is also no hardcoded endpoint for using the token - it is specified in the token cache or overridden by an environment variable.</p><h2><a id="Client+DSL+and+SDK+Details">Client DSL and SDK Details</a> <a href="#Client+DSL+and+SDK+Details"><img src="markbook-section-link.png"/></a></h2><p>The lack of any formal SDK or client for REST APIs in Hadoop led to thinking about a very simple client that could help people use and evaluate the gateway. The list below outlines the general requirements for such a client.</p>
 <ul>
   <li>Promote the evaluation and adoption of the Apache Knox Gateway</li>
   <li>Simple to deploy and use on data worker desktops for access to remote Hadoop clusters</li>

Modified: knox/trunk/books/0.12.0/book.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.12.0/book.md?rev=1787275&r1=1787274&r2=1787275&view=diff
==============================================================================
--- knox/trunk/books/0.12.0/book.md (original)
+++ knox/trunk/books/0.12.0/book.md Fri Mar 17 01:16:34 2017
@@ -68,6 +68,10 @@
 * #[Websocket Support]
 * #[Audit]
 * #[Client Details]
+    * #[Client Quickstart]
+    * #[Client Token Sessions]
+        * #[Server Setup]
+    * #[Client DSL and SDK Details]
 * #[Service Details]
     * #[WebHDFS]
     * #[WebHCat]

Modified: knox/trunk/books/0.12.0/book_client-details.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.12.0/book_client-details.md?rev=1787275&r1=1787274&r2=1787275&view=diff
==============================================================================
--- knox/trunk/books/0.12.0/book_client-details.md (original)
+++ knox/trunk/books/0.12.0/book_client-details.md Fri Mar 17 01:16:34 2017
@@ -16,13 +16,180 @@
 --->
 
 ## Client Details ##
+The KnoxShell release artifact provides a small footprint client environment that removes all unnecessary server dependencies, configuration, binary scripts, etc. It comprises a few different components that empower different types of users.
 
-Hadoop requires a client that can be used to interact remotely with the services provided by Hadoop cluster.
-This will also be true when using the Apache Knox Gateway to provide perimeter security and centralized access for these services.
-The two primary existing clients for Hadoop are the CLI (i.e. Command Line Interface, hadoop) and [Hue](http://gethue.com/) (i.e. Hadoop User Experience).
-For several reasons however, neither of these clients can _currently_ be used to access Hadoop services via the Apache Knox Gateway.
+* A set of SDK-type classes providing access to Hadoop resources over HTTP
+* A Groovy-based DSL for scripting access to Hadoop resources based on the underlying SDK classes
+* KnoxShell Token-based Sessions that provide a CLI SSO session for executing multiple scripts
 
-This led to thinking about a very simple client that could help people use and evaluate the gateway.
+The following sections provide an overview and quickstart for the KnoxShell.
+
+### Client Quickstart ###
+The following installation and setup instructions should get you started with using the KnoxShell very quickly.
+
+1. Download a knoxshell-x.x.x.zip or tar file and unzip it in your preferred location {GATEWAY_CLIENT_HOME}
+
+        home:knoxshell-0.12.0 larry$ ls -l
+        total 296
+        -rw-r--r--@  1 larry  staff  71714 Mar 14 14:06 LICENSE
+        -rw-r--r--@  1 larry  staff    164 Mar 14 14:06 NOTICE
+        -rw-r--r--@  1 larry  staff  71714 Mar 15 20:04 README
+        drwxr-xr-x@ 12 larry  staff    408 Mar 15 21:24 bin
+        drwxr--r--@  3 larry  staff    102 Mar 14 14:06 conf
+        drwxr-xr-x+  3 larry  staff    102 Mar 15 12:41 logs
+        drwxr-xr-x@ 18 larry  staff    612 Mar 14 14:18 samples
+        
+    |Directory    | Description |
+    |-------------|-------------|
+    |bin          |contains the main knoxshell jar and related shell scripts|
+    |conf         |only contains log4j config|
+    |logs         |contains the knoxshell.log file|
+    |samples      |has numerous examples to help you get started|
+
+2. cd {GATEWAY_CLIENT_HOME}
+3. Get/set up the truststore for the target Knox instance or fronting load balancer
+    - if you have access to the server you may use the command knoxcli.sh export-cert --type JKS
+    - copy the resulting gateway-client-identity.jks to your user home directory
+4. Execute an example script from the {GATEWAY_CLIENT_HOME}/samples directory - for instance:
+    - bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy
+    
+        home:knoxshell-0.12.0 larry$ bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy
+        Enter username: guest
+        Enter password:
+        [app-logs, apps, mapred, mr-history, tmp, user]
+
+At this point, you should have seen something similar to the above output - probably with different directories listed. Take a look at the sample that we ran above:
+
+    import groovy.json.JsonSlurper
+    import org.apache.hadoop.gateway.shell.Hadoop
+    import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+    import org.apache.hadoop.gateway.shell.Credentials
+
+    gateway = "https://localhost:8443/gateway/sandbox"
+
+    credentials = new Credentials()
+    credentials.add("ClearInput", "Enter username: ", "user")
+                    .add("HiddenInput", "Enter pas" + "sword: ", "pass")
+    credentials.collect()
+
+    username = credentials.get("user").string()
+    pass = credentials.get("pass").string()
+
+    session = Hadoop.login( gateway, username, pass )
+
+    text = Hdfs.ls( session ).dir( "/" ).now().string
+    json = (new JsonSlurper()).parseText( text )
+    println json.FileStatuses.FileStatus.pathSuffix
+    session.shutdown()
+
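The pathSuffix extraction at the end of the script has a direct analogue in any JSON-aware language. As a rough sketch in Python rather than the Groovy DSL (the response body below is a trimmed, hypothetical WebHDFS LISTSTATUS payload; real responses carry many more fields per entry):

```python
import json

# Trimmed, hypothetical WebHDFS LISTSTATUS response body.
text = '''
{"FileStatuses": {"FileStatus": [
  {"pathSuffix": "app-logs", "type": "DIRECTORY"},
  {"pathSuffix": "tmp", "type": "DIRECTORY"},
  {"pathSuffix": "user", "type": "DIRECTORY"}
]}}
'''

# Equivalent of (new JsonSlurper()).parseText(text) followed by
# json.FileStatuses.FileStatus.pathSuffix in the Groovy sample.
parsed = json.loads(text)
names = [s["pathSuffix"] for s in parsed["FileStatuses"]["FileStatus"]]
print(names)  # ['app-logs', 'tmp', 'user']
```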
+Some things to note about this sample:
+
+1. the gateway URL is hardcoded
+    - alternatives would be passing it as an argument to the script, using an environment variable or prompting for it with a ClearInput credential collector
+2. credential collectors are used to gather credentials or other input from various sources. In this sample the HiddenInput and ClearInput collectors prompt the user for the input with the provided prompt text and the values are acquired by a subsequent get call with the provided name value.
+3. The Hadoop.login method establishes a login session of sorts which will need to be provided to the various API classes as an argument.
+4. the response text is easily retrieved as a string and can be parsed by the JsonSlurper or whatever you like
+
+### Client Token Sessions ###
+Building on the Quickstart above we will drill into some of the token session details here and walk through another sample.
+
+Unlike the quickstart, token sessions require the server to be configured in specific ways to allow the use of token sessions/federation.
+
+#### Server Setup ####
+1. The KnoxToken service should be added to your sandbox.xml topology - see the [KnoxToken Configuration Section](#KnoxToken+Configuration)
+
+        <service>
+           <role>KNOXTOKEN</role>
+           <param>
+              <name>knox.token.ttl</name>
+              <value>36000000</value>
+           </param>
+           <param>
+              <name>knox.token.audiences</name>
+              <value>tokenbased</value>
+           </param>
+           <param>
+              <name>knox.token.target.url</name>
+              <value>https://localhost:8443/gateway/tokenbased</value>
+           </param>
+        </service>
+
+2. Add a tokenbased.xml topology that accepts tokens as federation tokens for access to exposed resources, using the [JWT Provider](#JWT+Provider)
+
+        <provider>
+           <role>federation</role>
+           <name>JWTProvider</name>
+           <enabled>true</enabled>
+           <param>
+               <name>knox.token.audiences</name>
+               <value>tokenbased</value>
+           </param>
+        </provider>
+3. Use the KnoxShell token commands to establish and manage your session
+    - bin/knoxshell.sh init https://localhost:8443/gateway/sandbox to acquire a token and cache it in the user home directory
+    - bin/knoxshell.sh list to display the details of the cached token, the expiration time and optionally the target URL
+    - bin/knoxshell.sh destroy to remove the cached session token and terminate the session
+
+4. Execute a script that can take advantage of the token credential collector and target url
+
+        import groovy.json.JsonSlurper
+        import java.util.HashMap
+        import java.util.Map
+        import org.apache.hadoop.gateway.shell.Credentials
+        import org.apache.hadoop.gateway.shell.Hadoop
+        import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+        credentials = new Credentials()
+        credentials.add("KnoxToken", "none: ", "token")
+        credentials.collect()
+
+        token = credentials.get("token").string()
+
+        gateway = System.getenv("KNOXSHELL_TOPOLOGY_URL")
+        if (gateway == null || gateway.equals("")) {
+          gateway = credentials.get("token").getTargetUrl()
+        }
+
+        println ""
+        println "*****************************GATEWAY INSTANCE**********************************"
+        println gateway
+        println "*******************************************************************************"
+        println ""
+
+        headers = new HashMap()
+        headers.put("Authorization", "Bearer " + token)
+
+        session = Hadoop.login( gateway, headers )
+
+        if (args.length > 0) {
+          dir = args[0]
+        } else {
+          dir = "/"
+        }
+
+        text = Hdfs.ls( session ).dir( dir ).now().string
+        json = (new JsonSlurper()).parseText( text )
+        statuses = json.get("FileStatuses");
+
+        println statuses
+
+        session.shutdown()
+
+Note the following about the above sample script:
+
+1. use of the KnoxToken credential collector
+2. use of the targetUrl from the credential collector
+3. optional override of the target url with environment variable
+4. the passing of the headers map to the session creation in Hadoop.login
+5. the passing of an argument for the ls command for the path to list or default to "/"
+
+Also note that there is no reason to prompt for username and password as long as the token has not been destroyed or expired.
+There is also no hardcoded endpoint for using the token - it is specified in the token cache or overridden by an environment variable.
+
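The gateway resolution and Bearer header construction in the script above can be sketched outside the Groovy DSL. A minimal Python analogue (the KNOXSHELL_TOPOLOGY_URL variable name comes from the script; the cached target URL and token value here are made-up placeholders):

```python
import os

def resolve_gateway(cached_target_url: str) -> str:
    # Mirror the script's precedence: environment variable first,
    # falling back to the target URL stored alongside the cached token.
    env_url = os.getenv("KNOXSHELL_TOPOLOGY_URL")
    return env_url if env_url else cached_target_url

def bearer_headers(token: str) -> dict:
    # The session token is presented as a standard Bearer authorization header.
    return {"Authorization": "Bearer " + token}

gateway = resolve_gateway("https://localhost:8443/gateway/tokenbased")
headers = bearer_headers("placeholder.jwt.token")
print(gateway)
print(headers["Authorization"])
```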
+## Client DSL and SDK Details ##
+
+The lack of any formal SDK or client for REST APIs in Hadoop led to thinking about a very simple client that could help people use and evaluate the gateway.
 The list below outlines the general requirements for such a client.
 
 * Promote the evaluation and adoption of the Apache Knox Gateway