Posted to commits@knox.apache.org by mo...@apache.org on 2021/07/20 15:16:30 UTC

svn commit: r1891689 [3/14] - in /knox: site/ site/books/knox-0-12-0/ site/books/knox-0-13-0/ site/books/knox-0-14-0/ site/books/knox-1-0-0/ site/books/knox-1-1-0/ site/books/knox-1-2-0/ site/books/knox-1-3-0/ site/books/knox-1-4-0/ site/books/knox-1-5...

Added: knox/site/books/knox-1-6-0/earth.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/earth.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/earth.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/error.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/error.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/error.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/fs-mount-login-1.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/fs-mount-login-1.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/fs-mount-login-1.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/general_saml_flow.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/general_saml_flow.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/general_saml_flow.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/info.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/info.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/info.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/invalid.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/invalid.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/invalid.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/knox-logo.gif
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/knox-logo.gif?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/knox-logo.gif
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/knoxline-splash-2.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/knoxline-splash-2.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/knoxline-splash-2.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/knoxshell-help.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/knoxshell-help.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/knoxshell-help.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/knoxshell_user_guide.html
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/knoxshell_user_guide.html?rev=1891689&view=auto
==============================================================================
--- knox/site/books/knox-1-6-0/knoxshell_user_guide.html (added)
+++ knox/site/books/knox-1-6-0/knoxshell_user_guide.html Tue Jul 20 15:16:28 2021
@@ -0,0 +1,241 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       https://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+<link href="book.css" rel="stylesheet"/>
+<img src="knox-logo.gif" alt="Knox"/>
+<img src="apache-logo.gif" align="right" alt="Apache"/>
+<h1><a id="Apache+Knox+-+KnoxShell+1.5.x+User+Guide">Apache Knox - KnoxShell 1.5.x User Guide</a> <a href="#Apache+Knox+-+KnoxShell+1.5.x+User+Guide"><img src="markbook-section-link.png"/></a></h1>
+<ul>
+  <li><a href="#Introduction">Introduction</a></li>
+  <li><a href="#Representing+and+Working+with+Tabular+Data">Representing and Working with Tabular Data</a></li>
+  <li><a href="#KnoxShellTable">KnoxShellTable</a>
+    <ul>
+      <li><a href="#Builders">Builders</a></li>
+    </ul>
+  </li>
+  <li><a href="#Usecases">Usecases</a>
+    <ul>
+      <li><a href="#JDBC+Resultset+Representations">JDBC Resultset Representations</a></li>
+      <li><a href="#CSV+Representations">CSV Representations</a></li>
+      <li><a href="#General+Table+Operations">General Table Operations</a></li>
+      <li><a href="#Persistence+and+Publishing">Persistence and Publishing</a></li>
+      <li><a href="#KnoxLine+SQL+Shell">KnoxLine SQL Shell</a></li>
+      <li><a href="#Custom+GroovySh+Commands">Custom GroovySh Commands</a></li>
+    </ul>
+  </li>
+  <li><a href="#JDBC+Resultset+Representations">JDBC Resultset Representations</a></li>
+  <li><a href="#CSV+Representations">CSV Representations</a></li>
+  <li><a href="#General+Table+Operations">General Table Operations</a>
+    <ul>
+      <li><a href="#Sorting">Sorting</a></li>
+      <li><a href="#Selecting">Selecting</a></li>
+      <li><a href="#Filtering">Filtering</a></li>
+      <li><a href="#Fluent+API">Fluent API</a></li>
+      <li><a href="#Aggregating">Aggregating</a></li>
+    </ul>
+  </li>
+  <li><a href="#KnoxLine+SQL+Shell">KnoxLine SQL Shell</a></li>
+  <li><a href="#Custom+GroovySh+Commands">Custom GroovySh Commands</a>
+    <ul>
+      <li><a href="#KnoxShell+Commands:">KnoxShell Commands:</a></li>
+    </ul>
+  </li>
+  <li><a href="#EXAMPLE:+COVID19+Data+Flow+into+DataLake">EXAMPLE: COVID19 Data Flow into DataLake</a>
+    <ul>
+      <li><a href="#Build+Table+from+Public+CSV+File">Build Table from Public CSV File</a></li>
+      <li><a href="#Select+Columns,+Filter+and+Sort+by+Column">Select Columns, Filter and Sort by Column</a></li>
+      <li><a href="#Aggregate+Calculations+on+Columns+of+Table">Aggregate Calculations on Columns of Table</a></li>
+      <li><a href="#Persist+Tables+to+Local+Disk">Persist Tables to Local Disk</a></li>
+      <li><a href="#Add+Tables+to+DataLake">Add Tables to DataLake</a></li>
+      <li><a href="#Building+KnoxShell+Truststore">Building KnoxShell Truststore</a></li>
+      <li><a href="#Mount+a+WebHDFS+Filesystem">Mount a WebHDFS Filesystem</a></li>
+      <li><a href="#Accessing+a+Filesystem">Accessing a Filesystem</a></li>
+      <li><a href="#Put+Tables+into+DataLake">Put Tables into DataLake</a></li>
+      <li><a href="#Pull+CSV+Files+from+WebHDFS+and+Create+Tables">Pull CSV Files from WebHDFS and Create Tables</a></li>
+    </ul>
+  </li>
+</ul>
+<h2><a id="Introduction">Introduction</a> <a href="#Introduction"><img src="markbook-section-link.png"/></a></h2>
+<p>The KnoxShell environment has been extended to provide more of an interactive experience through the use of custom commands and the newly added KnoxShellTable rendering and dataset representation class. This is provided by integrating the power of groovysh extensions with the KnoxShell client classes/SDK, and makes for some really powerful command line capabilities that would otherwise require the user to SSH to a node within the cluster and use the CLIs of different tools or components.</p>
+<p>This document will cover the various KnoxShell extensions, how to use them on their own, and how to combine them as flows for working with tabular data from various sources.</p>
+<h2><a id="Representing+and+Working+with+Tabular+Data">Representing and Working with Tabular Data</a> <a href="#Representing+and+Working+with+Tabular+Data"><img src="markbook-section-link.png"/></a></h2>
+<p>The ability to read, write and work with tabular data formats such as CSV files, JDBC resultsets and others is core to the motivations of this KnoxShell oriented work. Intentions include: the ability to read arbitrary data from sources from inside a proxied cluster or from external sources, the ability to render the resulting tables, sort the table, filter it for specific subsets of the data and do some interesting calculations that can provide simple insights into your data.</p>
+<p>KnoxShellTable represents those core capabilities with its simple representation of a table, operation methods and builder classes.</p>
+<h2><a id="KnoxShellTable">KnoxShellTable</a> <a href="#KnoxShellTable"><img src="markbook-section-link.png"/></a></h2>
+<p>KnoxShellTable has a number of dedicated builders that have a fluent API for building table representations from various sources.</p>
+<h3><a id="Builders">Builders</a> <a href="#Builders"><img src="markbook-section-link.png"/></a></h3>
+<p>The following builders aid in the creation of tables from various types of data sources.</p>
+<h4><a id="JDBC">JDBC</a> <a href="#JDBC"><img src="markbook-section-link.png"/></a></h4>
+<pre><code><br/>    ports = KnoxShellTable.builder().jdbc().
+      connect(&quot;jdbc:hive2://knox-host:8443/;ssl=true;transportMode=http;httpPath=topology/cdp-proxy-api/hive&quot;).
+      driver(&quot;org.apache.hive.jdbc.HiveDriver&quot;).
+      username(&quot;lmccay&quot;).pwd(&quot;xxxx&quot;).
+      sql(&quot;select * FROM ports&quot;);
+</code></pre>
+<p>Running the above within KnoxShell will submit the provided SQL to HS2, create and assign a new KnoxShellTable instance to the &ldquo;ports&rdquo; variable representing the border ports of entry data.</p>
+<h4><a id="CSV">CSV</a> <a href="#CSV"><img src="markbook-section-link.png"/></a></h4>
+<pre><code>crossings = KnoxShellTable.builder().csv().
+	withHeaders().
+	url(&quot;file:///home/lmccay/Border_Crossing_Entry_Data.csv&quot;)
+</code></pre>
+<p>Running the above within KnoxShell will import a CSV file from local disk, then create and assign a new KnoxShellTable instance to the &ldquo;crossings&rdquo; variable.</p>
+<p>A higher level KnoxShell Custom Command allows for easier use of the builder through more natural syntax and hides the use of the lower level classes and syntax.</p>
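+<p>For illustration only, a sketch of what invoking such a command might look like; the exact subcommand and argument names here are assumptions, so consult the in-shell help for the authoritative syntax:</p>
+<pre><code>// hypothetical higher-level command usage; arguments are illustrative
+:CSV withHeaders url file:///home/lmccay/Border_Crossing_Entry_Data.csv
+</code></pre>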
+<h4><a id="Join">Join</a> <a href="#Join"><img src="markbook-section-link.png"/></a></h4>
+<pre><code>crossings = KnoxShellTable.builder().join().
+  left(ports).
+  right(crossings).
+  on(&quot;code&quot;,&quot;Port Code&quot;)
+</code></pre>
+<p>Running the above within KnoxShell will join the two tables, with a simple match of the values in the left and right tables on each row that matches.</p>
+<h4><a id="JSON">JSON</a> <a href="#JSON"><img src="markbook-section-link.png"/></a></h4>
+<pre><code>tornados = KnoxShellTable.builder().json().
+  url(&quot;file:///home/lmccay/.knoxshell/.tables/tornados.json&quot;)
+</code></pre>
+<p>Running the above within KnoxShell will rematerialize a table that was persisted as JSON and assign it to a local &ldquo;tornados&rdquo; variable.</p>
+<h4><a id="Persistence+and+Publishing">Persistence and Publishing</a> <a href="#Persistence+and+Publishing"><img src="markbook-section-link.png"/></a></h4>
+<p>Being able to create tables, combine them with other datasets, filter them, add new cols based on calculations between cols, etc. is all great for creating tables in memory and working with them.</p>
+<p>We also want to be able to persist these tables in a KnoxShellTable canonical JSON format of its own and be able to reload the same datasets later.</p>
+<p>We also want to be able to take a given dataset and publish it as a brand new CSV file that can be pushed into HDFS, saved to local disk, written to cloud storage, etc.</p>
+<p>In addition, we may want to be able to write it directly to Hive or another JDBC datasource.</p>
+<h5><a id="JSON">JSON</a> <a href="#JSON"><img src="markbook-section-link.png"/></a></h5>
+<pre><code>tornados.toJSON()
+</code></pre>
+<p>The above will return and render a JSON representation of the tornados KnoxShellTable including: headers, rows, optionally title and optionally callHistory.</p>
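+<p>As a rough sketch, and assuming the JSON field names mirror the elements listed above (this is illustrative, not the canonical schema), the output might be shaped like:</p>
+<pre><code>{
+  &quot;title&quot;: &quot;tornados&quot;,
+  &quot;headers&quot;: [&quot;state&quot;, &quot;cat&quot;, &quot;inj&quot;, &quot;fat&quot;],
+  &quot;rows&quot;: [[&quot;NJ&quot;, &quot;F1&quot;, &quot;0&quot;, &quot;0&quot;]],
+  &quot;callHistory&quot;: []
+}
+</code></pre>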
+<h5><a id="CSV">CSV</a> <a href="#CSV"><img src="markbook-section-link.png"/></a></h5>
+<pre><code>tornados.toCSV()
+</code></pre>
+<p>The above will return and render a CSV representation of the tornados KnoxShellTable including: headers (if present), and all rows.</p>
+<p>Note that title and callHistory, which are KnoxShellTable specifics, are excluded and lost unless also saved as JSON.</p>
+<h2><a id="Usecases">Usecases</a> <a href="#Usecases"><img src="markbook-section-link.png"/></a></h2>
+<ul>
+  <li>JDBC Resultset Representations</li>
+  <li>CSV Representations</li>
+  <li>General Table Operations</li>
+  <li>Joining</li>
+  <li>Sorting, Selecting, Filtering, Calculations</li>
+  <li>Persistence and Publishing</li>
+  <li>KnoxLine SQL Shell</li>
+  <li>Custom GroovySh Commands</li>
+</ul>
+<p>Let&rsquo;s take a look at each usecase.</p>
+<h2><a id="JDBC+Resultset+Representations">JDBC Resultset Representations</a> <a href="#JDBC+Resultset+Representations"><img src="markbook-section-link.png"/></a></h2>
+<p>KnoxLine SQL Client requires a tabular representation of the data from a SQL/JDBC Resultset. This requirement led to the creation of the KnoxShellTable JDBC Builder. It may be used outside of KnoxLine within your own Java clients or groovy scripts leveraging the KnoxShell classes.</p>
+<pre><code>ports = KnoxShellTable.builder().jdbc().
+  connect(&quot;jdbc:hive2://knox-host:8443/;ssl=true;transportMode=http;httpPath=topology/datalake-api/hive&quot;).
+  driver(&quot;org.apache.hive.jdbc.HiveDriver&quot;).
+  username(&quot;lmccay&quot;).pwd(&quot;xxxx&quot;).
+  sql(&quot;select * FROM ports&quot;);
+</code></pre>
+<p>It can create the cols based on the metadata of the resultset, accurately represent the data, and perform type specific operations, sorts, etc.</p>
+<p>A higher level KnoxShell Custom Command allows for the use of this builder with Datasources that are managed within the KnoxShell environment and persisted to the users&rsquo; home directory to allow continued use across sessions. This command hides the use of the underlying classes and syntax and allows the user to concentrate on SQL.</p>
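+<p>As a sketch of how that might look at the command line (the subcommand and argument names are assumptions for illustration; the in-shell :datasource and :SQL help is authoritative):</p>
+<pre><code>// hypothetical usage of the Datasources and SQL commands
+:ds add datalake jdbc:hive2://knox-host:8443/;ssl=true;transportMode=http;httpPath=topology/datalake-api/hive
+:sql select * FROM ports
+</code></pre>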
+<h2><a id="CSV+Representations">CSV Representations</a> <a href="#CSV+Representations"><img src="markbook-section-link.png"/></a></h2>
+<p>Another dedicated table builder is provided for creating a table from a CSV file that is imported via URL.</p>
+<p>Combined with all the general table operations and ability to join them with other KnoxShellTable representations, this allows for CSV data to be combined with JDBC datasets, filtered and republished as a new dataset or report to be rendered or even reexecuted later.</p>
+<h2><a id="General+Table+Operations">General Table Operations</a> <a href="#General+Table+Operations"><img src="markbook-section-link.png"/></a></h2>
+<p>In addition to the builders described above, there are a number of operations that may be executed on the table itself.</p>
+<h3><a id="Sorting">Sorting</a> <a href="#Sorting"><img src="markbook-section-link.png"/></a></h3>
+<pre><code>tornados.sort(&quot;state&quot;)
+</code></pre>
+<p>The above will sort the tornados table by the &ldquo;state&rdquo; column.</p>
+<p>When a column is of String type but the values are actually numeric, you may also sort numerically.</p>
+<pre><code>tornados.sortNumeric(&quot;count&quot;)
+</code></pre>
+<h3><a id="Selecting">Selecting</a> <a href="#Selecting"><img src="markbook-section-link.png"/></a></h3>
+<pre><code>tornados.select(&quot;state,cat,inj,fat,date,month,day,year&quot;)
+</code></pre>
+<p>The above will return and render a new table with only the subset of cols selected.</p>
+<h3><a id="Filtering">Filtering</a> <a href="#Filtering"><img src="markbook-section-link.png"/></a></h3>
+<pre><code>tornados.filter().name(&quot;fat&quot;).greaterThan(0)
+</code></pre>
+<p>The above will return and render a table with only those tornados that resulted in one or more fatalities.</p>
+<h3><a id="Fluent+API">Fluent API</a> <a href="#Fluent+API"><img src="markbook-section-link.png"/></a></h3>
+<p>The above operations can be combined in a natural, fluent manner</p>
+<pre><code>tornados.select(&quot;state,cat,inj,fat,date,month,day,year&quot;).
+
+  filter().name(&quot;fat&quot;).greaterThan(0).
+
+  sort(&quot;state&quot;)
+</code></pre>
+<p>The above allows you to combine operations by streaming them into each other: in one line you get the select of only certain cols, the filtering of only those events with more than 0 fatalities, and the much more efficient sort of the resulting table.</p>
+<h3><a id="Aggregating">Aggregating</a> <a href="#Aggregating"><img src="markbook-section-link.png"/></a></h3>
+<p>The following method allows for the use of table column calculations to build an aggregate view of helpful calculations for multiple columns in a table and summarizes them in a new table representation.</p>
+<pre><code>table.aggregate().columns(&quot;col1, col2, col3&quot;).functions(&quot;min,max,mean,median,mode,sum&quot;)
+</code></pre>
+<p>The above summarizes the min, max, mean, median, mode and sum calculations for each of the named columns in a new table rendering.</p>
+<h2><a id="KnoxLine+SQL+Shell">KnoxLine SQL Shell</a> <a href="#KnoxLine+SQL+Shell"><img src="markbook-section-link.png"/></a></h2>
+<p>KnoxLine is a beeline-like facility built into the KnoxShell client toolbox with basic datasource management and simple SQL client capabilities. ResultSets are rendered via KnoxShellTable, but further table-based manipulations are not available within the knoxline shell. It is purely dedicated to SQL interactions and table renderings.</p>
+<p>For leveraging the SQL builder of KnoxShellTable to be able to operate on the results locally, see the custom KnoxShell command &lsquo;SQL&rsquo;.</p>
+<p><img src="knoxline-splash-2.png" /></p>
+<p>Once connected to the datasource, SQL commands may be invoked via the command line directly.</p>
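+<p>A hypothetical session sketch follows; the launch script name and prompt shown here are assumptions for illustration:</p>
+<pre><code>// illustrative only
+bin/knoxline.sh
+sql&gt; select * from ports
+</code></pre>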
+<h2><a id="Custom+GroovySh+Commands">Custom GroovySh Commands</a> <a href="#Custom+GroovySh+Commands"><img src="markbook-section-link.png"/></a></h2>
+<p>Groovy shell has the ability to extend the commands available to help automate scripting or coding that you would otherwise need to do programmatically over and over.</p>
+<p>By providing custom commands for KnoxShellTable operations, builders and manipulation we can greatly simplify what would need to be done with the fluent API of KnoxShellTable and groovy/java code for saving state, etc.</p>
+<h3><a id="KnoxShell+Commands:">KnoxShell Commands:</a> <a href="#KnoxShell+Commands:"><img src="markbook-section-link.png"/></a></h3>
+<ol>
+  <li><strong>Datasources</strong> (:datasource|:ds) CRUD and select operations for a set of JDBC datasources that are persisted to disk (KNOX-2128)</li>
+  <li><strong>SQL</strong> (:SQL|:sql) SQL query execution with persisted SQL history per datasource (KNOX-2128)</li>
+  <li><strong>CSV</strong> (:CSV|:csv) Import and Export from CSV and JSON formats</li>
+  <li><strong>Filesystem</strong> (:Filesystem|:fs) POSIX style commands for HDFS and cloud storage (mount, unmount, mounts, ls, rm, mkdir, cat, put, etc)</li>
+</ol>
+<p><img src="knoxshell-help.png" /></p>
+<h2><a id="EXAMPLE:+COVID19+Data+Flow+into+DataLake">EXAMPLE: COVID19 Data Flow into DataLake</a> <a href="#EXAMPLE:+COVID19+Data+Flow+into+DataLake"><img src="markbook-section-link.png"/></a></h2>
+<p>Let&rsquo;s start to put the commands and table capabilities together to consume some public tabular data and usher it into our datalake or cluster.</p>
+<h3><a id="Build+Table+from+Public+CSV+File">Build Table from Public CSV File</a> <a href="#Build+Table+from+Public+CSV+File"><img src="markbook-section-link.png"/></a></h3>
+<p><img src="covid19csv-1.png" /></p>
+<p>The use of the CSV KnoxShell command above can be easily correlated to the CSV builder of KnoxShellTable. It is obviously less verbose and more natural than using the fluent API of KnoxShellTable directly, and it also leverages a separate KnoxShell capability to assign the resulting table to a variable that can be referenced and manipulated afterward.</p>
+<p>As you can see, the result of creating the table from a CSV file is a rendering of the entire table, which often does not fit the screen properly. This is where the operations on the resulting table come in handy for exploring the dataset. Let&rsquo;s filter the above dataset of COVID19 across the world down to a subset of columns and to only New Jersey, by selecting, filtering and sorting numerically by number of Confirmed cases.</p>
+<h3><a id="Select+Columns,+Filter+and+Sort+by+Column">Select Columns, Filter and Sort by Column</a> <a href="#Select+Columns,+Filter+and+Sort+by+Column"><img src="markbook-section-link.png"/></a></h3>
+<p>First we will interrogate the table for its column names or headers. Then we will select only those columns that we want in order to fit it to the screen, filter it for only New Jersey information and sort numerically by the number of Confirmed cases per county.</p>
+<p><img src="covid19nj-1.png" alt="COVID19NJ-1" /></p>
+<p>From the above operation, we can now see the COVID19 data for New Jersey counties for 4/10/2020 sorted by the number of Confirmed cases and the subset of cols of the most interest and tailored to fit our screen. From the above table, we can visually see a number of insights in terms of the most affected counties across the state of New Jersey but it may be more interesting to be able to see an aggregation of some of the calculations available for numeric columns through KnoxShellTable. Let&rsquo;s take a look at an aggregate table for this dataset.</p>
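+<p>Expressed directly against the fluent API, the operation shown above might look like the following sketch; the variable, the column names and the equalTo predicate are assumptions about this particular dataset and API:</p>
+<pre><code>// illustrative only; column names depend on the source CSV
+covid19nj = covid19.select(&quot;Admin2,Province_State,Confirmed,Deaths&quot;).
+  filter().name(&quot;Province_State&quot;).equalTo(&quot;New Jersey&quot;).
+  sortNumeric(&quot;Confirmed&quot;)
+</code></pre>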
+<h3><a id="Aggregate+Calculations+on+Columns+of+Table">Aggregate Calculations on Columns of Table</a> <a href="#Aggregate+Calculations+on+Columns+of+Table"><img src="markbook-section-link.png"/></a></h3>
+<p>Since the KnoxShellTable fluent API allows us to chain such operations together easily, we will just hit the up arrow to get the previous table operation command and add the aggregate operation to the chain.</p>
+<p><img src="covid19nj-aggregate-1.png" /></p>
+<p>Now, by using both tables above, we can see not only that my county of Camden is visually in approximately the center of the counties in terms of Confirmed case numbers, but also how it stands relative to both the average and the median calculations. You can also see the sum of all of New Jersey and the number of those that belong to my county.</p>
+<h3><a id="Persist+Tables+to+Local+Disk">Persist Tables to Local Disk</a> <a href="#Persist+Tables+to+Local+Disk"><img src="markbook-section-link.png"/></a></h3>
+<p>Next, we will persist these tables to our local disk and then push them into our HDFS based datalake for access by cluster resources and other users.</p>
+<p><img src="covid19-persistence.png" /></p>
+<h3><a id="Add+Tables+to+DataLake">Add Tables to DataLake</a> <a href="#Add+Tables+to+DataLake"><img src="markbook-section-link.png"/></a></h3>
+<p>Now that we have these tables persisted to local disk, we can use our KnoxShell Filesystem commands to add them to the datalake.</p>
+<h3><a id="Building+KnoxShell+Truststore">Building KnoxShell Truststore</a> <a href="#Building+KnoxShell+Truststore"><img src="markbook-section-link.png"/></a></h3>
+<p>Before we can access resources from a datalake behind Knox, we need to ensure that the cert presented by the Knox instance is trusted. If the deployment is using certs signed by a well-known CA, then we generally don&rsquo;t have to do anything. If we are using Knox self-signed certs, or certs signed by an internal CA of some sort, then we must import them into the KnoxShell truststore. While this truststore can be located in arbitrary places and configured via system properties and environment variables, the most common approach is to use the default location.</p>
+<pre><code>// exit knoxshell
+^C
+
+bin/knoxshell.sh buildTrustStore https://nightly7x-1.nightly7x.root.hwx.site:8443/gateway/datalake-api
+
+ls -l ~/gateway-client-trust.jks
+
+// to reenter knoxshell
+bin/knoxshell.sh
+</code></pre>
+<p>The resulting truststore uses the default password of &lsquo;changeit&rsquo;.</p>
+<h3><a id="Mount+a+WebHDFS+Filesystem">Mount a WebHDFS Filesystem</a> <a href="#Mount+a+WebHDFS+Filesystem"><img src="markbook-section-link.png"/></a></h3>
+<p>We may now mount a filesystem from the remote Knox instance by mounting the topology that hosts the WebHDFS API endpoint.</p>
+<pre><code>:fs mount https://nightly7x-1.nightly7x.root.hwx.site:8443/gateway/datalake-api nightly
+</code></pre>
+<h3><a id="Accessing+a+Filesystem">Accessing a Filesystem</a> <a href="#Accessing+a+Filesystem"><img src="markbook-section-link.png"/></a></h3>
+<p>Once we have the desired mount, we may now access it by specifying the mountpoint name as the path prefix into the HDFS filesystem. Upon mounting or first access, the KnoxShell will prompt for user credentials for use as HTTP Basic credentials while accessing WebHDFS API.</p>
+<p><img src="fs-mount-login-1.png" /></p>
+<p>Once we authenticate to the mounted filesystem, we reference it by mountpoint and never concern ourselves with the actual URL to the endpoint.</p>
+<h3><a id="Put+Tables+into+DataLake">Put Tables into DataLake</a> <a href="#Put+Tables+into+DataLake"><img src="markbook-section-link.png"/></a></h3>
+<p><img src="covid19nj-put-webhdfs-1.png" /></p>
+<p>Above, we have put the previously persisted CSV files into the tmp directory of the mounted filesystem to be available to other datalake users.</p>
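+<p>A sketch of what such a put might look like; the argument order of local source followed by mounted destination is an assumption, so see the in-shell help for the authoritative syntax:</p>
+<pre><code>// hypothetical invocation
+:fs put covid19nj.csv nightly/tmp/covid19nj.csv
+</code></pre>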
+<p>We can now also access them from any other KnoxShell instance that has mounted this filesystem with appropriate credentials. Let&rsquo;s now cat the contents of one of the CSV files into the KnoxShell and then render it as a table from the raw CSV format.</p>
+<h3><a id="Pull+CSV+Files+from+WebHDFS+and+Create+Tables">Pull CSV Files from WebHDFS and Create Tables</a> <a href="#Pull+CSV+Files+from+WebHDFS+and+Create+Tables"><img src="markbook-section-link.png"/></a></h3>
+<p><img src="covid19-nj-agg-from-webhdfs-1.png" /></p>
+<p>Note that the cat command returns the CSV file contents as a string to the KnoxShell environment, as a variable called &lsquo;_&rsquo;.</p>
+<p>This is true of any command in groovysh or KnoxShell. The previous result is always available as this variable. Here we pass the contents of the variable to the CSV KnoxShellTable builder string() method. This is a very convenient way to render tabular data from a cat&rsquo;d file from your remote datalake. </p>
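+<p>Putting those pieces together, a sketch of the flow; the string() builder method is named above, while the mountpoint and path are illustrative:</p>
+<pre><code>// illustrative only
+:fs cat nightly/tmp/covid19nj.csv
+table = KnoxShellTable.builder().csv().withHeaders().string(_)
+</code></pre>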
+<p>Also note that tables that are assigned to variables within KnoxShell will render themselves just by typing the variable name.</p>
\ No newline at end of file

Added: knox/site/books/knox-1-6-0/knoxsso_integration.html
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/knoxsso_integration.html?rev=1891689&view=auto
==============================================================================
--- knox/site/books/knox-1-6-0/knoxsso_integration.html (added)
+++ knox/site/books/knox-1-6-0/knoxsso_integration.html Tue Jul 20 15:16:28 2021
@@ -0,0 +1,571 @@
+<h1>Knox SSO Integration for UIs</h1>
+<h2>Introduction</h2>
+<p>KnoxSSO provides an abstraction for integrating any number of authentication systems and SSO solutions and enables participating web applications to scale to those solutions more easily. Without the token exchange capabilities offered by KnoxSSO each component UI would need to integrate with each desired solution on its own. </p>
+<p>This document examines the way to integrate with KnoxSSO in the form of a Servlet Filter. This approach should be easily extrapolated into other frameworks - e.g. Spring Security.</p>
+<h3><a id="General+Flow">General Flow</a> <a href="#General+Flow"><img src="markbook-section-link.png"/></a></h3>
+<p>The following is a generic sequence diagram for SAML integration through KnoxSSO.</p>
+<img src='general_saml_flow.png'/>
+<h4><a id="KnoxSSO+Setup">KnoxSSO Setup</a> <a href="#KnoxSSO+Setup"><img src="markbook-section-link.png"/></a></h4>
+<h5><a id="knoxsso.xml+Topology">knoxsso.xml Topology</a> <a href="#knoxsso.xml+Topology"><img src="markbook-section-link.png"/></a></h5>
+<p>In order to enable KnoxSSO, we need to configure the IdP topology. The following is an example of this topology that is configured to use HTTP Basic Auth against the Knox Demo LDAP server. This is the lowest barrier of entry for your development environment that actually authenticates against a real user store. What’s great is that if your integration works against the IdP with Basic Auth, it will work with SAML or anything else as well.</p>
+<pre><code>		&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;
+		&lt;topology&gt;
+    		&lt;gateway&gt;
+        		&lt;provider&gt;
+            		&lt;role&gt;authentication&lt;/role&gt;
+            		&lt;name&gt;ShiroProvider&lt;/name&gt;
+            		&lt;enabled&gt;true&lt;/enabled&gt;
+            		&lt;param&gt;
+	                	&lt;name&gt;sessionTimeout&lt;/name&gt;
+                		&lt;value&gt;30&lt;/value&gt;
+            		&lt;/param&gt;
+            		&lt;param&gt;
+                		&lt;name&gt;main.ldapRealm&lt;/name&gt;
+                		&lt;value&gt;org.apache.knox.gateway.shirorealm.KnoxLdapRealm&lt;/value&gt;
+            		&lt;/param&gt;
+            		&lt;param&gt;
+                		&lt;name&gt;main.ldapContextFactory&lt;/name&gt;
+                		&lt;value&gt;org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory&lt;/value&gt;
+            		&lt;/param&gt;
+            		&lt;param&gt;
+                		&lt;name&gt;main.ldapRealm.contextFactory&lt;/name&gt;
+                		&lt;value&gt;$ldapContextFactory&lt;/value&gt;
+            		&lt;/param&gt;
+            		&lt;param&gt;
+                		&lt;name&gt;main.ldapRealm.userDnTemplate&lt;/name&gt;
+                		&lt;value&gt;uid={0},ou=people,dc=hadoop,dc=apache,dc=org&lt;/value&gt;
+            		&lt;/param&gt;
+            		&lt;param&gt;
+                		&lt;name&gt;main.ldapRealm.contextFactory.url&lt;/name&gt;
+                		&lt;value&gt;ldap://localhost:33389&lt;/value&gt;
+            		&lt;/param&gt;
+            		&lt;param&gt;
+                		&lt;name&gt;main.ldapRealm.contextFactory.authenticationMechanism&lt;/name&gt;
+                		&lt;value&gt;simple&lt;/value&gt;
+            		&lt;/param&gt;
+            		&lt;param&gt;
+                		&lt;name&gt;urls./**&lt;/name&gt;
+                		&lt;value&gt;authcBasic&lt;/value&gt;
+            		&lt;/param&gt;
+        		&lt;/provider&gt;
+            &lt;provider&gt;
+        		    &lt;role&gt;identity-assertion&lt;/role&gt;
+            		&lt;name&gt;Default&lt;/name&gt;
+            		&lt;enabled&gt;true&lt;/enabled&gt;
+        		&lt;/provider&gt;
+    		&lt;/gateway&gt;
+        &lt;service&gt;
+        		&lt;role&gt;KNOXSSO&lt;/role&gt;
+        		&lt;param&gt;
+          			&lt;name&gt;knoxsso.cookie.secure.only&lt;/name&gt;
+          			&lt;value&gt;true&lt;/value&gt;
+        		&lt;/param&gt;
+        		&lt;param&gt;
+          			&lt;name&gt;knoxsso.token.ttl&lt;/name&gt;
+          			&lt;value&gt;100000&lt;/value&gt;
+        		&lt;/param&gt;
+    		&lt;/service&gt;
+		&lt;/topology&gt;
+</code></pre>
+<p>Just as with any Knox service, the KNOXSSO service is protected by the gateway providers defined above it. In this case, the ShiroProvider is taking care of HTTP Basic Auth against LDAP for us. Once the user authenticates the request processing continues to the KNOXSSO service that will create the required cookie and do the necessary redirects.</p>
+<p>The authentication/federation provider can be swapped out to fit your deployment environment.</p>
+<h5><a id="sandbox.xml+Topology">sandbox.xml Topology</a> <a href="#sandbox.xml+Topology"><img src="markbook-section-link.png"/></a></h5>
+<p>In order to see the end to end story and use it as an example in your development, you can configure one of the cluster topologies to use the SSOCookieProvider instead of the out of the box ShiroProvider. The following is an example sandbox.xml topology that is configured for using KnoxSSO to protect access to the Hadoop REST APIs.</p>
+<pre><code>	&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;
+&lt;topology&gt;
+  &lt;gateway&gt;
+    &lt;provider&gt;
+        &lt;role&gt;federation&lt;/role&gt;
+        &lt;name&gt;SSOCookieProvider&lt;/name&gt;
+        &lt;enabled&gt;true&lt;/enabled&gt;
+        &lt;param&gt;
+            &lt;name&gt;sso.authentication.provider.url&lt;/name&gt;
+            &lt;value&gt;https://localhost:9443/gateway/idp/api/v1/websso&lt;/value&gt;
+        &lt;/param&gt;
+    &lt;/provider&gt;
+    &lt;provider&gt;
+        &lt;role&gt;identity-assertion&lt;/role&gt;
+        &lt;name&gt;Default&lt;/name&gt;
+        &lt;enabled&gt;true&lt;/enabled&gt;
+    &lt;/provider&gt;
+  &lt;/gateway&gt;    
+  &lt;service&gt;
+      &lt;role&gt;NAMENODE&lt;/role&gt;
+      &lt;url&gt;hdfs://localhost:8020&lt;/url&gt;
+  &lt;/service&gt;
+  &lt;service&gt;
+      &lt;role&gt;JOBTRACKER&lt;/role&gt;
+      &lt;url&gt;rpc://localhost:8050&lt;/url&gt;
+  &lt;/service&gt;
+  &lt;service&gt;
+      &lt;role&gt;WEBHDFS&lt;/role&gt;
+      &lt;url&gt;http://localhost:50070/webhdfs&lt;/url&gt;
+  &lt;/service&gt;
+  &lt;service&gt;
+      &lt;role&gt;WEBHCAT&lt;/role&gt;
+      &lt;url&gt;http://localhost:50111/templeton&lt;/url&gt;
+  &lt;/service&gt;
+  &lt;service&gt;
+      &lt;role&gt;OOZIE&lt;/role&gt;
+      &lt;url&gt;http://localhost:11000/oozie&lt;/url&gt;
+  &lt;/service&gt;
+  &lt;service&gt;
+      &lt;role&gt;WEBHBASE&lt;/role&gt;
+      &lt;url&gt;http://localhost:60080&lt;/url&gt;
+  &lt;/service&gt;
+  &lt;service&gt;
+      &lt;role&gt;HIVE&lt;/role&gt;
+      &lt;url&gt;http://localhost:10001/cliservice&lt;/url&gt;
+  &lt;/service&gt;
+  &lt;service&gt;
+      &lt;role&gt;RESOURCEMANAGER&lt;/role&gt;
+      &lt;url&gt;http://localhost:8088/ws&lt;/url&gt;
+  &lt;/service&gt;
+&lt;/topology&gt;
+</code></pre>
+<ul>
+  <li>NOTE: Be aware that when using Chrome as your browser, cookies don’t seem to work for “localhost”. Either use a VM or - like I did - use 127.0.0.1. Safari works with localhost without problems.</li>
+</ul>
+<p>As you can see above, the only thing being configured is the SSO provider URL. Since Knox is the issuer of the cookie and token, we don’t need to configure the public key since we have programmatic access to the actual keystore for use at verification time.</p>
+<h4><a id="Curl+the+Flow">Curl the Flow</a> <a href="#Curl+the+Flow"><img src="markbook-section-link.png"/></a></h4>
+<p>We should now be able to walk through the SSO Flow at the command line with curl to see everything that happens.</p>
+<p>First, issue a request to WEBHDFS through knox.</p>
+<pre><code>	bash-3.2$ curl -iku guest:guest-password https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+	
+	HTTP/1.1 302 Found
+	Location: https://localhost:8443/gateway/idp/api/v1/websso?originalUrl=https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+	Content-Length: 0
+	Server: Jetty(8.1.14.v20131031)
+</code></pre>
+<p>Note the redirect to the knoxsso endpoint and the login URL with the originalUrl request parameter. We need to see the same thing come from your integration as well.</p>
+<p>Let’s manually follow that redirect with curl now:</p>
+<pre><code>	bash-3.2$ curl -iku guest:guest-password &quot;https://localhost:8443/gateway/idp/api/v1/websso?originalUrl=https://localhost:9443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS&quot;
+
+	HTTP/1.1 307 Temporary Redirect
+	Set-Cookie: JSESSIONID=mlkda4crv7z01jd0q0668nsxp;Path=/gateway/idp;Secure;HttpOnly
+	Set-Cookie: hadoop-jwt=eyJhbGciOiJSUzI1NiJ9.eyJleHAiOjE0NDM1ODUzNzEsInN1YiI6Imd1ZXN0IiwiYXVkIjoiSFNTTyIsImlzcyI6IkhTU08ifQ.RpA84Qdr6RxEZjg21PyVCk0G1kogvkuJI2bo302bpwbvmc-i01gCwKNeoGYzUW27MBXf6a40vylHVR3aZuuBUxsJW3aa_ltrx0R5ztKKnTWeJedOqvFKSrVlBzJJ90PzmDKCqJxA7JUhyo800_lDHLTcDWOiY-ueWYV2RMlCO0w;Path=/;Domain=localhost;Secure;HttpOnly
+	Expires: Thu, 01 Jan 1970 00:00:00 GMT
+	Location: https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+	Content-Length: 0
+	Server: Jetty(8.1.14.v20131031)
+</code></pre>
+<p>Note the redirect back to the original URL in the Location header and the Set-Cookie for the hadoop-jwt cookie. This is what the SSOCookieProvider in sandbox (and ultimately in your integration) will be looking for.</p>
+<p>Finally, we should be able to take the above cookie and pass it to the original url as indicated in the Location header for our originally requested resource:</p>
+<pre><code>	bash-3.2$ curl -ikH &quot;Cookie: hadoop-jwt=eyJhbGciOiJSUzI1NiJ9.eyJleHAiOjE0NDM1ODY2OTIsInN1YiI6Imd1ZXN0IiwiYXVkIjoiSFNTTyIsImlzcyI6IkhTU08ifQ.Os5HEfVBYiOIVNLRIvpYyjeLgAIMbBGXHBWMVRAEdiYcNlJRcbJJ5aSUl1aciNs1zd_SHijfB9gOdwnlvQ_0BCeGHlJBzHGyxeypIoGj9aOwEf36h-HVgqzGlBLYUk40gWAQk3aRehpIrHZT2hHm8Pu8W-zJCAwUd8HR3y6LF3M;Path=/;Domain=localhost;Secure;HttpOnly&quot; https://localhost:9443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+
+	TODO: cluster was down and needs to be recreated :/
+</code></pre>
+<h4><a id="Browse+the+Flow">Browse the Flow</a> <a href="#Browse+the+Flow"><img src="markbook-section-link.png"/></a></h4>
+<p>At this point, we can use a web browser instead of the command line and see how the browser will challenge the user for Basic Auth Credentials and then manage the cookies such that the SSO and token exchange aspects of the flow are hidden from the user.</p>
+<p>Simply, try to invoke the same webhdfs API from the browser URL bar.</p>
+<pre><code>		https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+</code></pre>
+<p>Based on our understanding of the flow it should behave like:</p>
+<ul>
+  <li>SSOCookieProvider checks for hadoop-jwt cookie and in its absence redirects to the configured SSO provider URL (knoxsso endpoint)</li>
+  <li>ShiroProvider on the KnoxSSO endpoint returns a 401 and the browser challenges the user for username/password</li>
+  <li>The ShiroProvider authenticates the user against the Demo LDAP Server using a simple LDAP bind and establishes the security context for the WebSSO request</li>
+  <li>The WebSSO service exchanges the normalized Java Subject into a JWT token and sets it on the response as a cookie named hadoop-jwt</li>
+  <li>The WebSSO service then redirects the user agent back to the originally requested URL - the webhdfs Knox service; subsequent invocations will find the cookie in the incoming request and will not need to engage the WebSSO service again until it expires</li>
+</ul>
+<h4><a id="Filter+by+Example">Filter by Example</a> <a href="#Filter+by+Example"><img src="markbook-section-link.png"/></a></h4>
+<p>We have added a federation provider to Knox for accepting KnoxSSO cookies for REST APIs. This provides us with a couple of benefits: KnoxSSO support for REST APIs invoked via XmlHttpRequests from JavaScript (basic CORS functionality is also included - this is still rather basic and considered beta code), and a model and real world usecase for others to base their integrations on.</p>
+<p>In addition, <a href="https://issues.apache.org/jira/browse/HADOOP-11717">https://issues.apache.org/jira/browse/HADOOP-11717</a> added support for the Hadoop UIs to the hadoop-auth module and it can be used as another example.</p>
+<p>We will examine the new SSOCookieFederationFilter in Knox here.</p>
+<pre><code>package org.apache.knox.gateway.provider.federation.jwt.filter;
+
+import java.io.IOException;
+import java.security.Principal;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import javax.security.auth.Subject;
+import javax.servlet.Filter;
+import javax.servlet.FilterChain;
+import javax.servlet.FilterConfig;
+import javax.servlet.ServletException;
+import javax.servlet.ServletRequest;
+import javax.servlet.ServletResponse;
+import javax.servlet.http.Cookie;
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+
+import org.apache.knox.gateway.i18n.messages.MessagesFactory;
+import org.apache.knox.gateway.provider.federation.jwt.JWTMessages;
+import org.apache.knox.gateway.security.PrimaryPrincipal;
+import org.apache.knox.gateway.services.GatewayServices;
+import org.apache.knox.gateway.services.security.token.JWTokenAuthority;
+import org.apache.knox.gateway.services.security.token.TokenServiceException;
+import org.apache.knox.gateway.services.security.token.impl.JWTToken;
+
+public class SSOCookieFederationFilter implements Filter {
+  private static JWTMessages log = MessagesFactory.get( JWTMessages.class );
+  private static final String ORIGINAL_URL_QUERY_PARAM = &quot;originalUrl=&quot;;
+  private static final String SSO_COOKIE_NAME = &quot;sso.cookie.name&quot;;
+  private static final String SSO_EXPECTED_AUDIENCES = &quot;sso.expected.audiences&quot;;
+  private static final String SSO_AUTHENTICATION_PROVIDER_URL = &quot;sso.authentication.provider.url&quot;;
+  private static final String DEFAULT_SSO_COOKIE_NAME = &quot;hadoop-jwt&quot;;
+</code></pre>
+<p>The above represents the configurable aspects of the integration.</p>
+<pre><code>    private JWTokenAuthority authority = null;
+    private String cookieName = null;
+    private List&lt;String&gt; audiences = null;
+    private String authenticationProviderUrl = null;
+
+    @Override
+    public void init( FilterConfig filterConfig ) throws ServletException {
+      GatewayServices services = (GatewayServices) filterConfig.getServletContext().getAttribute(GatewayServices.GATEWAY_SERVICES_ATTRIBUTE);
+      authority = (JWTokenAuthority)services.getService(GatewayServices.TOKEN_SERVICE);
+</code></pre>
+<p>The above is a Knox specific internal service that we use to issue and verify JWT tokens. This will be covered separately and you will need to implement something similar in your filter implementation.</p>
+<pre><code>    // configured cookieName
+    cookieName = filterConfig.getInitParameter(SSO_COOKIE_NAME);
+    if (cookieName == null) {
+      cookieName = DEFAULT_SSO_COOKIE_NAME;
+    }
+</code></pre>
+<p>The configurable cookie name is something that can be used to change a cookie name to fit your deployment environment. The default name is hadoop-jwt which is also the default in the Hadoop implementation. This name must match the name being used by the KnoxSSO endpoint when setting the cookie.</p>
+<pre><code>    // expected audiences or null
+    String expectedAudiences = filterConfig.getInitParameter(SSO_EXPECTED_AUDIENCES);
+    if (expectedAudiences != null) {
+      audiences = parseExpectedAudiences(expectedAudiences);
+    }
+</code></pre>
+<p>Audiences are configured as a comma separated list of audience strings - names of intended recipients or intents. The semantics that we are using for this processing are: if no audiences are configured, then any (or no) audience is accepted. If there are audiences configured, then as long as one of the expected ones is found in the set of claims in the token, it is accepted.</p>
+<pre><code>    // url to SSO authentication provider
+    authenticationProviderUrl = filterConfig.getInitParameter(SSO_AUTHENTICATION_PROVIDER_URL);
+    if (authenticationProviderUrl == null) {
+      log.missingAuthenticationProviderUrlConfiguration();
+    }
+  }
+</code></pre>
+<p>This is the URL to the KnoxSSO endpoint. It is required and SSO/token exchange will not work without this set correctly.</p>
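+<p>Taken together, the three init parameters above surface as provider params in a Knox topology; the following sketch shows what configuring them might look like (the values are illustrative):</p>
+<pre><code>&lt;provider&gt;
+    &lt;role&gt;federation&lt;/role&gt;
+    &lt;name&gt;SSOCookieProvider&lt;/name&gt;
+    &lt;enabled&gt;true&lt;/enabled&gt;
+    &lt;param&gt;
+        &lt;name&gt;sso.authentication.provider.url&lt;/name&gt;
+        &lt;value&gt;https://localhost:9443/gateway/idp/api/v1/websso&lt;/value&gt;
+    &lt;/param&gt;
+    &lt;param&gt;
+        &lt;name&gt;sso.cookie.name&lt;/name&gt;
+        &lt;value&gt;hadoop-jwt&lt;/value&gt;
+    &lt;/param&gt;
+    &lt;param&gt;
+        &lt;name&gt;sso.expected.audiences&lt;/name&gt;
+        &lt;value&gt;HSSO&lt;/value&gt;
+    &lt;/param&gt;
+&lt;/provider&gt;
+</code></pre>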
+<pre><code>/**
+ * @param expectedAudiences
+ * @return
+ */
+private List&lt;String&gt; parseExpectedAudiences(String expectedAudiences) {
+  ArrayList&lt;String&gt; audList = null;
+  // setup the list of valid audiences for token validation
+  if (expectedAudiences != null) {
+    // parse into the list
+    String[] audArray = expectedAudiences.split(&quot;,&quot;);
+    audList = new ArrayList&lt;String&gt;();
+    for (String a : audArray) {
+      audList.add(a);
+    }
+  }
+  return audList;
+}
+</code></pre>
+<p>The above method parses the comma separated list of expected audiences and makes it available for interrogation during token validation.</p>
+<pre><code>public void destroy() {
+}
+
+public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
+    throws IOException, ServletException {
+  String wireToken = null;
+  HttpServletRequest req = (HttpServletRequest) request;
+
+  String loginURL = constructLoginURL(req);
+  wireToken = getJWTFromCookie(req);
+  if (wireToken == null) {
+    if (req.getMethod().equals(&quot;OPTIONS&quot;)) {
+      // CORS preflight requests to determine allowed origins and related config
+      // must be able to continue without being redirected
+      Subject sub = new Subject();
+      sub.getPrincipals().add(new PrimaryPrincipal(&quot;anonymous&quot;));
+      continueWithEstablishedSecurityContext(sub, req, (HttpServletResponse) response, chain);
+    }
+    log.sendRedirectToLoginURL(loginURL);
+    ((HttpServletResponse) response).sendRedirect(loginURL);
+  }
+  else {
+    JWTToken token = new JWTToken(wireToken);
+    boolean verified = false;
+    try {
+      verified = authority.verifyToken(token);
+      if (verified) {
+        Date expires = token.getExpiresDate();
+        if (expires == null || new Date().before(expires)) {
+          boolean audValid = validateAudiences(token);
+          if (audValid) {
+            Subject subject = createSubjectFromToken(token);
+            continueWithEstablishedSecurityContext(subject, (HttpServletRequest) request, (HttpServletResponse) response, chain);
+          }
+          else {
+            log.failedToValidateAudience();
+            ((HttpServletResponse) response).sendRedirect(loginURL);
+          }
+        }
+        else {
+          log.tokenHasExpired();
+          ((HttpServletResponse) response).sendRedirect(loginURL);
+        }
+      }
+      else {
+        log.failedToVerifyTokenSignature();
+        ((HttpServletResponse) response).sendRedirect(loginURL);
+      }
+    } catch (TokenServiceException e) {
+      log.unableToVerifyToken(e);
+      ((HttpServletResponse) response).sendRedirect(loginURL);
+    }
+  }
+}
+</code></pre>
+<p>The doFilter method above is where all the real work is done. We look for a cookie by the configured name. If it isn’t there then we redirect to the configured SSO provider URL in order to acquire one. That is unless it is an OPTIONS request which may be a preflight CORS request. You shouldn’t need to worry about this aspect. It is really a REST API concern not a web app UI one.</p>
+<p>Once we get a cookie, the underlying JWT token is extracted and returned as the wireToken from which we create a Knox specific JWTToken. This abstraction is around the use of the nimbus JWT library which you can use directly. We will cover those details separately.</p>
+<p>We then ask the token authority component to verify the token. This involves signature validation of the signed token. In order to verify the signature of the token you will need to have the public key of the Knox SSO server configured and provided to the nimbus library through its API at verification time. NOTE: This is a good place to look at the Hadoop implementation as an example.</p>
+<p>Once we know the token is signed by a trusted party, we then validate whether it is expired and that it carries an expected (or no) audience claim.</p>
+<p>Finally, when we have a valid token, we create a Java Subject from it and continue the request through the filterChain as the authenticated user.</p>
+<pre><code>/**
+ * Encapsulate the acquisition of the JWT token from HTTP cookies within the
+ * request.
+ *
+ * @param req servlet request to get the JWT token from
+ * @return serialized JWT token
+ */
+protected String getJWTFromCookie(HttpServletRequest req) {
+  String serializedJWT = null;
+  Cookie[] cookies = req.getCookies();
+  if (cookies != null) {
+    for (Cookie cookie : cookies) {
+      if (cookieName.equals(cookie.getName())) {
+        log.cookieHasBeenFound(cookieName);
+        serializedJWT = cookie.getValue();
+        break;
+      }
+    }
+  }
+  return serializedJWT;
+}
+</code></pre>
+<p>The above method extracts the serialized token from the cookie and returns it as the wireToken.</p>
+<pre><code>/**
+ * Create the URL to be used for authentication of the user in the absence of
+ * a JWT token within the incoming request.
+ *
+ * @param request for getting the original request URL
+ * @return url to use as login url for redirect
+ */
+protected String constructLoginURL(HttpServletRequest request) {
+  String delimiter = &quot;?&quot;;
+  if (authenticationProviderUrl.contains(&quot;?&quot;)) {
+    delimiter = &quot;&amp;&quot;;
+  }
+  String loginURL = authenticationProviderUrl + delimiter
+      + ORIGINAL_URL_QUERY_PARAM
+      + request.getRequestURL().toString() + getOriginalQueryString(request);
+  return loginURL;
+}
+
+private String getOriginalQueryString(HttpServletRequest request) {
+  String originalQueryString = request.getQueryString();
+  return (originalQueryString == null) ? &quot;&quot; : &quot;?&quot; + originalQueryString;
+}
+</code></pre>
+<p>The above method creates the full URL to be used in redirecting to the KnoxSSO endpoint. It includes the SSO provider URL as well as the original request URL so that we can redirect back to it after authentication and token exchange.</p>
+<pre><code>/**
+ * Validate whether any of the accepted audience claims is present in the
+ * issued token claims list for audience. Override this method in subclasses
+ * in order to customize the audience validation behavior.
+ *
+ * @param jwtToken
+ *          the JWT token where the allowed audiences will be found
+ * @return true if an expected audience is present, otherwise false
+ */
+protected boolean validateAudiences(JWTToken jwtToken) {
+  boolean valid = false;
+  String[] tokenAudienceList = jwtToken.getAudienceClaims();
+  // if there were no expected audiences configured then just
+  // consider any audience acceptable
+  if (audiences == null) {
+    valid = true;
+  } else {
+    // if any of the configured audiences is found then consider it
+    // acceptable
+    for (String aud : tokenAudienceList) {
+      if (audiences.contains(aud)) {
+        //log.debug(&quot;JWT token audience has been successfully validated&quot;);
+        log.jwtAudienceValidated();
+        valid = true;
+        break;
+      }
+    }
+  }
+  return valid;
+}
+</code></pre>
+<p>The above method implements the audience claim semantics explained earlier.</p>
+<pre><code>private void continueWithEstablishedSecurityContext(Subject subject, final HttpServletRequest request,
+    final HttpServletResponse response, final FilterChain chain) throws IOException, ServletException {
+  try {
+    Subject.doAs(
+      subject,
+      new PrivilegedExceptionAction&lt;Object&gt;() {
+        @Override
+        public Object run() throws Exception {
+          chain.doFilter(request, response);
+          return null;
+        }
+      }
+    );
+  }
+  catch (PrivilegedActionException e) {
+    Throwable t = e.getCause();
+    if (t instanceof IOException) {
+      throw (IOException) t;
+    }
+    else if (t instanceof ServletException) {
+      throw (ServletException) t;
+    }
+    else {
+      throw new ServletException(t);
+    }
+  }
+}
+</code></pre>
+<p>This method continues the filter chain processing upon successful validation of the token. This would need to be replaced with your environment’s equivalent of continuing the request or login to the app as the authenticated user.</p>
+<pre><code>private Subject createSubjectFromToken(JWTToken token) {
+  final String principal = token.getSubject();
+  @SuppressWarnings(&quot;rawtypes&quot;)
+  HashSet emptySet = new HashSet();
+  Set&lt;Principal&gt; principals = new HashSet&lt;Principal&gt;();
+  Principal p = new PrimaryPrincipal(principal);
+  principals.add(p);
+  javax.security.auth.Subject subject = new javax.security.auth.Subject(true, principals, emptySet, emptySet);
+  return subject;
+}
+</code></pre>
+<p>This method takes a JWTToken and creates a Java Subject with the principals expected by the rest of the Knox processing. This would need to be implemented in a way appropriate for your operating environment as well. For instance, the Hadoop handler implementation returns a Hadoop AuthenticationToken to the calling filter which in turn ends up in the Hadoop auth cookie.</p>
+<pre><code>}
+</code></pre>
+<h4><a id="Token+Signature+Validation">Token Signature Validation</a> <a href="#Token+Signature+Validation"><img src="markbook-section-link.png"/></a></h4>
+<p>The following is the method from the Hadoop handler implementation that validates the signature.</p>
+<pre><code>/**
+ * Verify the signature of the JWT token in this method. This method depends on the
+ * public key that was established during init based upon the provisioned public key.
+ * Override this method in subclasses in order to customize the signature verification behavior.
+ * @param jwtToken the token that contains the signature to be validated
+ * @return valid true if signature verifies successfully; false otherwise
+ */
+protected boolean validateSignature(SignedJWT jwtToken) {
+  boolean valid = false;
+  if (JWSObject.State.SIGNED == jwtToken.getState()) {
+    LOG.debug(&quot;JWT token is in a SIGNED state&quot;);
+    if (jwtToken.getSignature() != null) {
+      LOG.debug(&quot;JWT token signature is not null&quot;);
+      try {
+        JWSVerifier verifier = new RSASSAVerifier(publicKey);
+        if (jwtToken.verify(verifier)) {
+          valid = true;
+          LOG.debug(&quot;JWT token has been successfully verified&quot;);
+        }
+        else {
+          LOG.warn(&quot;JWT signature verification failed.&quot;);
+        }
+      }
+      catch (JOSEException je) {
+        LOG.warn(&quot;Error while validating signature&quot;, je);
+      }
+    }
+  }
+  return valid;
+}
+</code></pre>
+<p><strong>Hadoop Configuration Example</strong></p>
+<p>The following resembles the configuration in the Hadoop handler implementation.</p>
+<p>OBSOLETE, but in the proper spirit of HADOOP-11717 (Add Redirecting WebSSO behavior with JWT Token in Hadoop Auth - RESOLVED).</p>
+<pre><code>	&lt;property&gt;
+	  &lt;name&gt;hadoop.http.authentication.type&lt;/name&gt;
+	  &lt;value&gt;org.apache.hadoop.security.authentication.server.JWTRedirectAuthenticationHandler&lt;/value&gt;
+	&lt;/property&gt;
+</code></pre>
+<p>This is the handler class name in Hadoop auth that provides JWT token (KnoxSSO) support.</p>
+<pre><code>	&lt;property&gt;
+  		&lt;name&gt;hadoop.http.authentication.authentication.provider.url&lt;/name&gt;
+  		&lt;value&gt;http://c6401.ambari.apache.org:8888/knoxsso&lt;/value&gt;
+	&lt;/property&gt;
+</code></pre>
+<p>The above property is the SSO provider URL that points to the knoxsso endpoint.</p>
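+<p>When an unauthenticated request arrives, the handler redirects the browser to this URL, appending the originally requested URL so that KnoxSSO can send the user back after a successful login. The following is a rough sketch of how such a login URL might be built, not the exact Hadoop source; <code>authenticationProviderUrl</code> is an assumed field holding the property value above, and <code>originalUrl</code> is the query parameter name KnoxSSO expects by default.</p>
+<pre><code>	// Illustrative sketch of building the KnoxSSO login redirect URL.
+	protected String constructLoginURL(HttpServletRequest request) {
+	  String delimiter = authenticationProviderUrl.contains(&quot;?&quot;) ? &quot;&amp;&quot; : &quot;?&quot;;
+	  return authenticationProviderUrl + delimiter + &quot;originalUrl=&quot;
+	      + request.getRequestURL().toString();
+	}
+</code></pre>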
+<pre><code>	&lt;property&gt;
+   		&lt;name&gt;hadoop.http.authentication.public.key.pem&lt;/name&gt;
+   		&lt;value&gt;MIICVjCCAb+gAwIBAgIJAPPvOtuTxFeiMA0GCSqGSIb3DQEBBQUAMG0xCzAJBgNV
+   	BAYTAlVTMQ0wCwYDVQQIEwRUZXN0MQ0wCwYDVQQHEwRUZXN0MQ8wDQYDVQQKEwZI
+   	YWRvb3AxDTALBgNVBAsTBFRlc3QxIDAeBgNVBAMTF2M2NDAxLmFtYmFyaS5hcGFj
+   	aGUub3JnMB4XDTE1MDcxNjE4NDcyM1oXDTE2MDcxNTE4NDcyM1owbTELMAkGA1UE
+   	BhMCVVMxDTALBgNVBAgTBFRlc3QxDTALBgNVBAcTBFRlc3QxDzANBgNVBAoTBkhh
+   	ZG9vcDENMAsGA1UECxMEVGVzdDEgMB4GA1UEAxMXYzY0MDEuYW1iYXJpLmFwYWNo
+   	ZS5vcmcwgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBAMFs/rymbiNvg8lDhsdA
+   	qvh5uHP6iMtfv9IYpDleShjkS1C+IqId6bwGIEO8yhIS5BnfUR/fcnHi2ZNrXX7x
+   	QUtQe7M9tDIKu48w//InnZ6VpAqjGShWxcSzR6UB/YoGe5ytHS6MrXaormfBg3VW
+   	tDoy2MS83W8pweS6p5JnK7S5AgMBAAEwDQYJKoZIhvcNAQEFBQADgYEANyVg6EzE
+   	2q84gq7wQfLt9t047nYFkxcRfzhNVL3LB8p6IkM4RUrzWq4kLA+z+bpY2OdpkTOe
+   	wUpEdVKzOQd4V7vRxpdANxtbG/XXrJAAcY/S+eMy1eDK73cmaVPnxPUGWmMnQXUi
+   	TLab+w8tBQhNbq6BOQ42aOrLxA8k/M4cV1A=&lt;/value&gt;
+	&lt;/property&gt;
+</code></pre>
+<p>The above property holds the KnoxSSO server’s public key for signature verification. Adding it directly to the config like this is convenient, and it is easily done through Ambari for existing config files that accept custom properties. Config files are generally protected with root-only access as well, so this is a reasonably secure approach.</p>
+<h4><a id="Public+Key+Parsing">Public Key Parsing</a> <a href="#Public+Key+Parsing"><img src="markbook-section-link.png"/></a></h4>
+<p>In order to turn the PEM-encoded config item into a public key, the Hadoop handler implementation does the following in the init() method.</p>
+<pre><code>   	if (publicKey == null) {
+     String pemPublicKey = config.getProperty(PUBLIC_KEY_PEM);
+     if (pemPublicKey == null) {
+       throw new ServletException(
+           &quot;Public key for signature validation must be provisioned.&quot;);
+     }
+     publicKey = CertificateUtil.parseRSAPublicKey(pemPublicKey);
+   }
+</code></pre>
+<p>The CertificateUtil class it depends on is shown below:</p>
+<pre><code>	package org.apache.hadoop.security.authentication.util;
+
+	import java.io.ByteArrayInputStream;
+	import java.io.UnsupportedEncodingException;
+	import java.security.PublicKey;
+	import java.security.cert.CertificateException;
+	import java.security.cert.CertificateFactory;
+	import java.security.cert.X509Certificate;
+	import java.security.interfaces.RSAPublicKey;
+
+	import javax.servlet.ServletException;
+
+	public class CertificateUtil {
+	  private static final String PEM_HEADER = &quot;-----BEGIN CERTIFICATE-----\n&quot;;
+	  private static final String PEM_FOOTER = &quot;\n-----END CERTIFICATE-----&quot;;
+
+	  /**
+	   * Gets an RSAPublicKey from the provided PEM encoding.
+	   *
+	   * @param pem the PEM encoding from config, without the header and footer
+	   * @return RSAPublicKey the RSA public key
+	   * @throws ServletException thrown if a processing error occurred
+	   */
+	  public static RSAPublicKey parseRSAPublicKey(String pem) throws ServletException {
+	    String fullPem = PEM_HEADER + pem + PEM_FOOTER;
+	    PublicKey key = null;
+	    try {
+	      CertificateFactory fact = CertificateFactory.getInstance(&quot;X.509&quot;);
+	      ByteArrayInputStream is = new ByteArrayInputStream(
+	          fullPem.getBytes(&quot;UTF8&quot;));
+	      X509Certificate cer = (X509Certificate) fact.generateCertificate(is);
+	      key = cer.getPublicKey();
+	    } catch (CertificateException ce) {
+	      String message = null;
+	      if (pem.startsWith(PEM_HEADER)) {
+	        message = &quot;CertificateException - be sure not to include PEM header &quot;
+	            + &quot;and footer in the PEM configuration element.&quot;;
+	      } else {
+	        message = &quot;CertificateException - PEM may be corrupt&quot;;
+	      }
+	      throw new ServletException(message, ce);
+	    } catch (UnsupportedEncodingException uee) {
+	      throw new ServletException(uee);
+	    }
+	    return (RSAPublicKey) key;
+	  }
+	}
+</code></pre>
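+<p>Tying the pieces together: the parsed key is what backs the verifier used during signature validation. A minimal sketch, where <code>pemPublicKey</code> is the value read from <code>hadoop.http.authentication.public.key.pem</code>:</p>
+<pre><code>	RSAPublicKey publicKey = CertificateUtil.parseRSAPublicKey(pemPublicKey);
+	JWSVerifier verifier = new RSASSAVerifier(publicKey);
+</code></pre>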
\ No newline at end of file

Added: knox/site/books/knox-1-6-0/markbook-section-link.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/markbook-section-link.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/markbook-section-link.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/plus.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/plus.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/plus.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/question.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/question.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/question.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/runtime-overview.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/runtime-overview.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/runtime-overview.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/runtime-request-processing.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/runtime-request-processing.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/runtime-request-processing.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/star.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/star.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/star.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: knox/site/books/knox-1-6-0/stop.png
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-6-0/stop.png?rev=1891689&view=auto
==============================================================================
Binary file - no diff available.

Propchange: knox/site/books/knox-1-6-0/stop.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream