Posted to commits@hbase.apache.org by bu...@apache.org on 2016/07/14 19:47:07 UTC

[44/52] [partial] hbase-site git commit: Published site at a55af38689fbe273e716ebbf6191e9515986dbf3.

http://git-wip-us.apache.org/repos/asf/hbase-site/blob/975096b1/book.html
----------------------------------------------------------------------
diff --git a/book.html b/book.html
index 3e5793e..2dee923 100644
--- a/book.html
+++ b/book.html
@@ -17111,7 +17111,8 @@ of the <a href="#security">Securing Apache HBase</a> chapter.</p>
 <p>The following examples use the placeholder server http://example.com:8000, and
 the following commands can all be run using <code>curl</code> or <code>wget</code> commands. You can request
 plain text (the default), XML, JSON, or protobuf output by adding no header for plain text,
-or the header "Accept: text/xml" for XML or "Accept: application/json" for JSON.</p>
+or the header "Accept: text/xml" for XML, "Accept: application/json" for JSON, or
+"Accept: application/x-protobuf" to for protocol buffers.</p>
 </div>
 <div class="admonitionblock note">
 <table>
@@ -17126,171 +17127,345 @@ creation or mutation, and <code>DELETE</code> for deletion.
 </tr>
 </table>
 </div>
-<div class="sect3">
-<h4 id="_cluster_information"><a class="anchor" href="#_cluster_information"></a>76.3.1. Cluster Information</h4>
-<div class="listingblock">
-<div class="title">HBase Version</div>
-<div class="content">
-<pre>http://example.com:8000/version/cluster</pre>
-</div>
-</div>
-<div class="listingblock">
-<div class="title">Cluster Status</div>
-<div class="content">
-<pre>http://example.com:8000/status/cluster</pre>
-</div>
-</div>
-<div class="listingblock">
-<div class="title">Table List</div>
-<div class="content">
-<pre>http://example.com:8000/</pre>
-</div>
-</div>
-</div>
-<div class="sect3">
-<h4 id="_table_information"><a class="anchor" href="#_table_information"></a>76.3.2. Table Information</h4>
-<div class="paragraph">
-<div class="title">Table Schema (GET)</div>
-<p>To retrieve the table schema, use a <code>GET</code> request with the <code>/schema</code> endpoint:</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/schema</pre>
-</div>
-</div>
-<div class="paragraph">
-<div class="title">Table Creation</div>
-<p>To create a table, use a <code>PUT</code> request with the <code>/schema</code> endpoint:</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/schema</pre>
-</div>
-</div>
-<div class="paragraph">
-<div class="title">Table Schema Update</div>
-<p>To update a table, use a <code>POST</code> request with the <code>/schema</code> endpoint:</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/schema</pre>
-</div>
-</div>
-<div class="paragraph">
-<div class="title">Table Deletion</div>
-<p>To delete a table, use a <code>DELETE</code> request with the <code>/schema</code> endpoint:</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/schema</pre>
-</div>
-</div>
-<div class="listingblock">
-<div class="title">Table Regions</div>
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/regions</pre>
-</div>
-</div>
-</div>
-<div class="sect3">
-<h4 id="_gets"><a class="anchor" href="#_gets"></a>76.3.3. Gets</h4>
-<div class="paragraph">
-<div class="title">GET a Single Cell Value</div>
-<p>To get a single cell value, use a URL scheme like the following:</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/&lt;row&gt;/&lt;column&gt;:&lt;qualifier&gt;/&lt;timestamp&gt;/content:raw</pre>
-</div>
-</div>
-<div class="paragraph">
-<p>The column qualifier and timestamp are optional. Without them, the whole row will
-be returned, or the newest version will be returned.</p>
-</div>
-<div class="paragraph">
-<div class="title">Multiple Single Values (Multi-Get)</div>
-<p>To get multiple single values, specify multiple column:qualifier tuples and/or a start-timestamp
-and end-timestamp. You can also limit the number of versions.</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/&lt;row&gt;/&lt;column&gt;:&lt;qualifier&gt;?v=&lt;num-versions&gt;</pre>
-</div>
-</div>
-<div class="paragraph">
-<div class="title">Globbing Rows</div>
-<p>To scan a series of rows, you can use a <code>*</code> glob
-character on the &lt;row&gt; value to glob together multiple rows.</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/urls/https|ad.doubleclick.net|*</pre>
-</div>
-</div>
-</div>
-<div class="sect3">
-<h4 id="_puts"><a class="anchor" href="#_puts"></a>76.3.4. Puts</h4>
-<div class="paragraph">
-<p>For Puts, <code>PUT</code> and <code>POST</code> are equivalent.</p>
-</div>
-<div class="paragraph">
-<div class="title">Put a Single Value</div>
-<p>The column qualifier and the timestamp are optional.</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/put/&lt;table&gt;/&lt;row&gt;/&lt;column&gt;:&lt;qualifier&gt;/&lt;timestamp&gt;
-http://example.com:8000/test/testrow/test:testcolumn</pre>
-</div>
-</div>
-<div class="paragraph">
-<div class="title">Put Multiple Values</div>
-<p>To put multiple values, use a false row key. Row, column, and timestamp values in
-the supplied cells override the specifications on the path, allowing you to post
-multiple values to a table in batch. The HTTP response code indicates the status of
-the put. Set the <code>Content-Type</code> to <code>text/xml</code> for XML encoding or to <code>application/x-protobuf</code>
-for protobufs encoding. Supply the commit data in the <code>PUT</code> or <code>POST</code> body, using
-the <a href="#xml_schema">REST XML Schema</a> and <a href="#protobufs_schema">REST Protobufs Schema</a> as guidelines.</p>
-</div>
-</div>
-<div class="sect3">
-<h4 id="_scans"><a class="anchor" href="#_scans"></a>76.3.5. Scans</h4>
-<div class="paragraph">
-<p><code>PUT</code> and <code>POST</code> are equivalent for scans.</p>
-</div>
-<div class="paragraph">
-<div class="title">Scanner Creation</div>
-<p>To create a scanner, use the <code>/scanner</code> endpoint. The HTTP response code indicates
-success (201) or failure (anything else), and on successful scanner creation, the
-URI is returned which should be used to address the scanner.</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/scanner</pre>
-</div>
-</div>
-<div class="paragraph">
-<div class="title">Scanner Get Next</div>
-<p>To get the next batch of cells found by the scanner, use the <code>/scanner/&lt;scanner-id&gt;'
-endpoint, using the URI returned by the scanner creation endpoint. If the scanner
-is exhausted, HTTP status `204</code> is returned.</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/scanner/&lt;scanner-id&gt;</pre>
-</div>
-</div>
-<div class="paragraph">
-<div class="title">Scanner Deletion</div>
-<p>To delete resources associated with a scanner, send a HTTP <code>DELETE</code> request to the
-<code>/scanner/&lt;scanner-id&gt;</code> endpoint.</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre>http://example.com:8000/&lt;table&gt;/scanner/&lt;scanner-id&gt;</pre>
-</div>
-</div>
-</div>
+<table class="tableblock frame-all grid-all spread">
+<caption class="title">Table 11. Cluster-Wide Endpoints</caption>
+<colgroup>
+<col style="width: 16%;">
+<col style="width: 8%;">
+<col style="width: 25%;">
+<col style="width: 50%;">
+</colgroup>
+<thead>
+<tr>
+<th class="tableblock halign-left valign-top">Endpoint</th>
+<th class="tableblock halign-left valign-top">HTTP Verb</th>
+<th class="tableblock halign-left valign-top">Description</th>
+<th class="tableblock halign-left valign-top">Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/version/cluster</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Version of HBase running on this cluster</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/version/cluster"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/status/cluster</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Cluster status</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/status/cluster"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">List of all non-system tables</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/"</pre></div></td>
+</tr>
+</tbody>
+</table>
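+<div class="paragraph">
+<p>The same content negotiation works from any HTTP client, not just <code>curl</code>. The sketch below is
+illustrative only and is not part of any HBase client API: it uses the JDK class
+<code>HttpURLConnection</code> to request the cluster version as JSON from the placeholder server used above.</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">import java.net.{HttpURLConnection, URL}
+import scala.io.Source
+
+// Ask for JSON output by setting the Accept header, mirroring the curl examples above.
+val conn = new URL(&quot;http://example.com:8000/version/cluster&quot;)
+  .openConnection().asInstanceOf[HttpURLConnection]
+conn.setRequestMethod(&quot;GET&quot;)
+conn.setRequestProperty(&quot;Accept&quot;, &quot;application/json&quot;)
+val body = Source.fromInputStream(conn.getInputStream).mkString
+println(s&quot;HTTP ${conn.getResponseCode}: $body&quot;)
+conn.disconnect()</code></pre>
+</div>
+</div>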
+<table class="tableblock frame-all grid-all spread">
+<caption class="title">Table 12. Namespace Endpoints</caption>
+<colgroup>
+<col style="width: 16%;">
+<col style="width: 8%;">
+<col style="width: 25%;">
+<col style="width: 50%;">
+</colgroup>
+<thead>
+<tr>
+<th class="tableblock halign-left valign-top">Endpoint</th>
+<th class="tableblock halign-left valign-top">HTTP Verb</th>
+<th class="tableblock halign-left valign-top">Description</th>
+<th class="tableblock halign-left valign-top">Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/namespaces</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">List all namespaces</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/namespaces/"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/namespaces/<em>namespace</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Describe a specific namespace</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/namespaces/special_ns"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/namespaces/<em>namespace</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>POST</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Create a new namespace</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X POST \
+  -H "Accept: text/xml" \
+  "example.com:8000/namespaces/special_ns"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/namespaces/<em>namespace</em>/tables</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">List all tables in a specific namespace</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/namespaces/special_ns/tables"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/namespaces/<em>namespace</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>PUT</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Alter an existing namespace. Currently not used.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X PUT \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/namespaces/special_ns</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/namespaces/<em>namespace</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DELETE</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Delete a namespace. The namespace must be empty.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X DELETE \
+  -H "Accept: text/xml" \
+  "example.com:8000/namespaces/special_ns"</pre></div></td>
+</tr>
+</tbody>
+</table>
+<table class="tableblock frame-all grid-all spread">
+<caption class="title">Table 13. Table Endpoints</caption>
+<colgroup>
+<col style="width: 16%;">
+<col style="width: 8%;">
+<col style="width: 25%;">
+<col style="width: 50%;">
+</colgroup>
+<thead>
+<tr>
+<th class="tableblock halign-left valign-top">Endpoint</th>
+<th class="tableblock halign-left valign-top">HTTP Verb</th>
+<th class="tableblock halign-left valign-top">Description</th>
+<th class="tableblock halign-left valign-top">Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/schema</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Describe the schema of the specified table.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/schema"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/schema</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>POST</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Create a new table, or replace an existing table&#8217;s schema</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X POST \
+  -H "Accept: text/xml" \
+  -H "Content-Type: text/xml" \
+  -d '&lt;?xml version="1.0" encoding="UTF-8"?&gt;&lt;TableSchema name="users"&gt;&lt;ColumnSchema name="cf" /&gt;&lt;/TableSchema&gt;' \
+  "http://example.com:8000/users/schema"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/schema</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>PUT</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Update an existing table with the provided schema fragment</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X PUT \
+  -H "Accept: text/xml" \
+  -H "Content-Type: text/xml" \
+  -d '&lt;?xml version="1.0" encoding="UTF-8"?&gt;&lt;TableSchema name="users"&gt;&lt;ColumnSchema name="cf" KEEP_DELETED_CELLS="true" /&gt;&lt;/TableSchema&gt;' \
+  "http://example.com:8000/users/schema"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/schema</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DELETE</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Delete the table. You must use the <code>/<em>table</em>/schema</code> endpoint, not just <code>/<em>table</em>/</code>.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X DELETE \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/schema"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/regions</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">List the table regions</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/regions</pre></div></td>
+</tr>
+</tbody>
+</table>
+<table class="tableblock frame-all grid-all spread">
+<caption class="title">Table 14. Endpoints for <code>Get</code> Operations</caption>
+<colgroup>
+<col style="width: 16%;">
+<col style="width: 8%;">
+<col style="width: 25%;">
+<col style="width: 50%;">
+</colgroup>
+<thead>
+<tr>
+<th class="tableblock halign-left valign-top">Endpoint</th>
+<th class="tableblock halign-left valign-top">HTTP Verb</th>
+<th class="tableblock halign-left valign-top">Description</th>
+<th class="tableblock halign-left valign-top">Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/<em>row</em>/<em>column:qualifier</em>/<em>timestamp</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Get the value of a single row. Values are Base-64 encoded.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/row1"
+
+curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/row1/cf:a/1458586888395"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/<em>row</em>/<em>column:qualifier</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Get the value of a single column. Values are Base-64 encoded.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/row1/cf:a"
+
+curl -vi -X GET \
+  -H "Accept: text/xml" \
+   "http://example.com:8000/users/row1/cf:a/"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/<em>row</em>/<em>column:qualifier</em>/?v=<em>number_of_versions</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Multi-Get a specified number of versions of a given cell. Values are Base-64 encoded.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/row1/cf:a?v=2"</pre></div></td>
+</tr>
+</tbody>
+</table>
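+<div class="paragraph">
+<p>Cell values returned by the endpoints above are Base-64 encoded. As a minimal sketch, the value
+<code>dmFsdWU1Cg==</code> (the sample value used in the Put example later in this section) can be decoded
+on the JVM with <code>java.util.Base64</code>:</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">import java.util.Base64
+import java.nio.charset.StandardCharsets
+
+// Decode a Base-64 cell value taken from a REST response.
+val encoded = &quot;dmFsdWU1Cg==&quot;
+val decoded = new String(Base64.getDecoder.decode(encoded), StandardCharsets.UTF_8)
+println(decoded)  // prints &quot;value5&quot; followed by a newline</code></pre>
+</div>
+</div>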
+<table class="tableblock frame-all grid-all spread">
+<caption class="title">Table 15. Endpoints for <code>Scan</code> Operations</caption>
+<colgroup>
+<col style="width: 16%;">
+<col style="width: 8%;">
+<col style="width: 25%;">
+<col style="width: 50%;">
+</colgroup>
+<thead>
+<tr>
+<th class="tableblock halign-left valign-top">Endpoint</th>
+<th class="tableblock halign-left valign-top">HTTP Verb</th>
+<th class="tableblock halign-left valign-top">Description</th>
+<th class="tableblock halign-left valign-top">Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/scanner/</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>PUT</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Get a Scanner object. Required by all other Scan operations. Adjust the batch parameter
+to the number of rows the scan should return in a batch. See the next example for
+adding filters to your scanner. The scanner endpoint URL is returned as the <code>Location</code>
+in the HTTP response. The other examples in this table assume that the scanner endpoint
+is <code>http://example.com:8000/users/scanner/145869072824375522207</code>.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X PUT \
+  -H "Accept: text/xml" \
+  -H "Content-Type: text/xml" \
+  -d '&lt;Scanner batch="1"/&gt;' \
+  "http://example.com:8000/users/scanner/"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/scanner/</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>PUT</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">To supply filters to the Scanner object or configure the
+Scanner in any other way, you can create a text file and add
+your filter to the file. For example, to return only rows for
+which keys start with <code>u123</code> and use a batch size
+of 100, the filter file would look like this:
+</p><p class="tableblock"></p><p class="tableblock"><pre>
+&lt;Scanner batch="100"&gt;
+  &lt;filter&gt;
+    {
+      "type": "PrefixFilter",
+      "value": "u123"
+    }
+  &lt;/filter&gt;
+&lt;/Scanner&gt;
+</pre>
+</p><p class="tableblock"></p><p class="tableblock">Pass the file to the <code>-d</code> argument of the <code>curl</code> request.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X PUT \
+  -H "Accept: text/xml" \
+  -H "Content-Type:text/xml" \
+  -d @filter.txt \
+  "http://example.com:8000/users/scanner/"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/scanner/<em>scanner-id</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>GET</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Get the next batch from the scanner. Cell values are byte-encoded. If the scanner
+has been exhausted, HTTP status <code>204</code> is returned.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X GET \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/scanner/145869072824375522207"</pre></div></td>
+</tr>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code><em>table</em>/scanner/<em>scanner-id</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DELETE</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Deletes the scanner and frees the resources it used.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X DELETE \
+  -H "Accept: text/xml" \
+  "http://example.com:8000/users/scanner/145869072824375522207"</pre></div></td>
+</tr>
+</tbody>
+</table>
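+<div class="paragraph">
+<p>The scanner lifecycle in the table above can also be driven programmatically. The sketch below is
+illustrative only; it uses the JDK class <code>HttpURLConnection</code> rather than any HBase client API,
+and assumes the placeholder table <code>users</code> used in the examples above.</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">import java.net.{HttpURLConnection, URL}
+import scala.io.Source
+
+// Create a scanner; the REST server answers 201 and returns the scanner URI
+// in the Location header.
+val create = new URL(&quot;http://example.com:8000/users/scanner/&quot;)
+  .openConnection().asInstanceOf[HttpURLConnection]
+create.setRequestMethod(&quot;PUT&quot;)
+create.setDoOutput(true)
+create.setRequestProperty(&quot;Content-Type&quot;, &quot;text/xml&quot;)
+val out = create.getOutputStream
+out.write(&quot;&lt;Scanner batch=\&quot;1\&quot;/&gt;&quot;.getBytes(&quot;UTF-8&quot;))
+out.close()
+require(create.getResponseCode == 201, &quot;scanner creation failed&quot;)
+val scannerUri = create.getHeaderField(&quot;Location&quot;)
+
+// Fetch the next batch; HTTP 204 means the scanner is exhausted.
+val next = new URL(scannerUri).openConnection().asInstanceOf[HttpURLConnection]
+next.setRequestProperty(&quot;Accept&quot;, &quot;text/xml&quot;)
+if (next.getResponseCode == 200) println(Source.fromInputStream(next.getInputStream).mkString)
+
+// Delete the scanner to free its resources.
+val del = new URL(scannerUri).openConnection().asInstanceOf[HttpURLConnection]
+del.setRequestMethod(&quot;DELETE&quot;)
+del.getResponseCode</code></pre>
+</div>
+</div>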
+<table class="tableblock frame-all grid-all spread">
+<caption class="title">Table 16. Endpoints for <code>Put</code> Operations</caption>
+<colgroup>
+<col style="width: 16%;">
+<col style="width: 8%;">
+<col style="width: 25%;">
+<col style="width: 50%;">
+</colgroup>
+<thead>
+<tr>
+<th class="tableblock halign-left valign-top">Endpoint</th>
+<th class="tableblock halign-left valign-top">HTTP Verb</th>
+<th class="tableblock halign-left valign-top">Description</th>
+<th class="tableblock halign-left valign-top">Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>/<em>table</em>/<em>row_key</em></code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock"><code>PUT</code></p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Write a row to a table. The row, column qualifier, and value must each be Base-64
+encoded. To encode a string, use the <code>base64</code> command-line utility. To decode the
+string, use <code>base64 -d</code>. The payload is in the <code>--data</code> argument, and the <code>/users/fakerow</code>
+value is a placeholder. Insert multiple rows by adding them to the <code>&lt;CellSet&gt;</code>
+element. You can also save the data to be inserted to a file and pass it to the <code>-d</code>
+parameter with syntax like <code>-d @filename.txt</code>.</p></td>
+<td class="tableblock halign-left valign-top"><div class="literal"><pre>curl -vi -X PUT \
+  -H "Accept: text/xml" \
+  -H "Content-Type: text/xml" \
+  -d '&lt;?xml version="1.0" encoding="UTF-8" standalone="yes"?&gt;&lt;CellSet&gt;&lt;Row key="cm93NQo="&gt;&lt;Cell column="Y2Y6ZQo="&gt;dmFsdWU1Cg==&lt;/Cell&gt;&lt;/Row&gt;&lt;/CellSet&gt;' \
+  "http://example.com:8000/users/fakerow"
+
+curl -vi -X PUT \
+  -H "Accept: text/json" \
+  -H "Content-Type: text/json" \
+  -d '{"Row":[{"key":"cm93NQo=", "Cell": [{"column":"Y2Y6ZQo=", "$":"dmFsdWU1Cg=="}]}]}'' \
+  "example.com:8000/users/fakerow"</pre></div></td>
+</tr>
+</tbody>
+</table>
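+<div class="paragraph">
+<p>The row key, column qualifier, and value in the Put payload must each be Base-64 encoded before
+they are placed in the XML. As a minimal sketch, the payload from the example above can be built with
+<code>java.util.Base64</code>; the plain-text values below are simply the decoded sample values, and note that
+the command-line <code>base64</code> examples above include a trailing newline in each encoded value, which is
+omitted here.</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">import java.util.Base64
+import java.nio.charset.StandardCharsets.UTF_8
+
+// Base-64 encode a plain string for use in the CellSet payload.
+def b64(s: String): String = Base64.getEncoder.encodeToString(s.getBytes(UTF_8))
+
+val (row, column, value) = (&quot;row5&quot;, &quot;cf:e&quot;, &quot;value5&quot;)
+val payload =
+  s&quot;&quot;&quot;&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;yes&quot;?&gt;&quot;&quot;&quot; +
+  s&quot;&quot;&quot;&lt;CellSet&gt;&lt;Row key=&quot;${b64(row)}&quot;&gt;&lt;Cell column=&quot;${b64(column)}&quot;&gt;${b64(value)}&lt;/Cell&gt;&lt;/Row&gt;&lt;/CellSet&gt;&quot;&quot;&quot;
+println(payload)</code></pre>
+</div>
+</div>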
 </div>
 <div class="sect2">
 <h3 id="xml_schema"><a class="anchor" href="#xml_schema"></a>76.4. REST XML Schema</h3>
@@ -17427,7 +17602,7 @@ is exhausted, HTTP status `204</code> is returned.</p>
   <span class="tag">&lt;complexType</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">Node</span><span class="delimiter">&quot;</span></span><span class="tag">&gt;</span>
     <span class="tag">&lt;sequence&gt;</span>
       <span class="tag">&lt;element</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">region</span><span class="delimiter">&quot;</span></span> <span class="attribute-name">type</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">tns:Region</span><span class="delimiter">&quot;</span></span>
-          <span class="attribute-name">maxOccurs</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">unbounded</span><span class="delimiter">&quot;</span></span> <span class="attribute-name">minOccurs</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">0</span><span class="delimiter">&quot;</span></span><span class="tag">&gt;</span>
+   <span class="attribute-name">maxOccurs</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">unbounded</span><span class="delimiter">&quot;</span></span> <span class="attribute-name">minOccurs</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">0</span><span class="delimiter">&quot;</span></span><span class="tag">&gt;</span>
       <span class="tag">&lt;/element&gt;</span>
     <span class="tag">&lt;/sequence&gt;</span>
     <span class="tag">&lt;attribute</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">name</span><span class="delimiter">&quot;</span></span> <span class="attribute-name">type</span>=<span class="string"><span class="delimiter">&quot;</span><span class="content">string</span><span class="delimiter">&quot;</span></span><span class="tag">&gt;</span><span class="tag">&lt;/attribute&gt;</span>
@@ -17654,7 +17829,7 @@ a row, get a column value, perform a query, and do some additional HBase operati
 
     <span class="comment">//*drop if table is already exist.*</span>
     <span class="keyword">if</span>(dbo.isTableExist(<span class="string"><span class="delimiter">&quot;</span><span class="content">user</span><span class="delimiter">&quot;</span></span>)){
-            dbo.deleteTable(<span class="string"><span class="delimiter">&quot;</span><span class="content">user</span><span class="delimiter">&quot;</span></span>);
+     dbo.deleteTable(<span class="string"><span class="delimiter">&quot;</span><span class="content">user</span><span class="delimiter">&quot;</span></span>);
     }
 
     <span class="comment">//*create table*</span>
@@ -18841,25 +19016,165 @@ values for this row for all column families.</p>
 <h2 id="_sparksql_dataframes"><a class="anchor" href="#_sparksql_dataframes"></a>86. SparkSQL/DataFrames</h2>
 <div class="sectionbody">
 <div class="paragraph">
-<p><a href="http://spark.apache.org/sql/">SparkSQL</a> is a subproject of Spark that supports
-SQL that will compute down to a Spark DAG. In addition,SparkSQL is a heavy user
-of DataFrames. DataFrames are like RDDs with schema information.</p>
+<p>The HBase-Spark Connector (in the HBase-Spark module) leverages the
+<a href="https://databricks.com/blog/2015/01/09/spark-sql-data-sources-api-unified-data-access-for-the-spark-platform.html">DataSource API</a>
+(<a href="https://issues.apache.org/jira/browse/SPARK-3247">SPARK-3247</a>)
+introduced in Spark 1.2.0. It bridges the gap between the simple HBase key-value store and complex
+relational SQL queries, and enables users to perform complex data analytics
+on top of HBase using Spark. An HBase DataFrame is a standard Spark DataFrame, and is able to
+interact with any other data sources such as Hive, ORC, Parquet, or JSON.
+The connector applies critical techniques such as partition pruning, column pruning,
+predicate pushdown and data locality.</p>
+</div>
+<div class="paragraph">
+<p>To use the HBase-Spark connector, users need to define a catalog for the schema mapping
+between the HBase and Spark tables, prepare the data and populate the HBase table,
+and then load the HBase DataFrame. After that, users can run integrated queries and access records
+in the HBase table with SQL. The following sections illustrate the basic procedure.</p>
+</div>
+<div class="sect2">
+<h3 id="_define_catalog"><a class="anchor" href="#_define_catalog"></a>86.1. Define catalog</h3>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">def catalog = s&quot;&quot;&quot;{
+�������|&quot;table&quot;:{&quot;namespace&quot;:&quot;default&quot;, &quot;name&quot;:&quot;table1&quot;},
+�������|&quot;rowkey&quot;:&quot;key&quot;,
+�������|&quot;columns&quot;:{
+���������|&quot;col0&quot;:{&quot;cf&quot;:&quot;rowkey&quot;, &quot;col&quot;:&quot;key&quot;, &quot;type&quot;:&quot;string&quot;},
+���������|&quot;col1&quot;:{&quot;cf&quot;:&quot;cf1&quot;, &quot;col&quot;:&quot;col1&quot;, &quot;type&quot;:&quot;boolean&quot;},
+���������|&quot;col2&quot;:{&quot;cf&quot;:&quot;cf2&quot;, &quot;col&quot;:&quot;col2&quot;, &quot;type&quot;:&quot;double&quot;},
+���������|&quot;col3&quot;:{&quot;cf&quot;:&quot;cf3&quot;, &quot;col&quot;:&quot;col3&quot;, &quot;type&quot;:&quot;float&quot;},
+���������|&quot;col4&quot;:{&quot;cf&quot;:&quot;cf4&quot;, &quot;col&quot;:&quot;col4&quot;, &quot;type&quot;:&quot;int&quot;},
+���������|&quot;col5&quot;:{&quot;cf&quot;:&quot;cf5&quot;, &quot;col&quot;:&quot;col5&quot;, &quot;type&quot;:&quot;bigint&quot;},
+���������|&quot;col6&quot;:{&quot;cf&quot;:&quot;cf6&quot;, &quot;col&quot;:&quot;col6&quot;, &quot;type&quot;:&quot;smallint&quot;},
+���������|&quot;col7&quot;:{&quot;cf&quot;:&quot;cf7&quot;, &quot;col&quot;:&quot;col7&quot;, &quot;type&quot;:&quot;string&quot;},
+���������|&quot;col8&quot;:{&quot;cf&quot;:&quot;cf8&quot;, &quot;col&quot;:&quot;col8&quot;, &quot;type&quot;:&quot;tinyint&quot;}
+�������|}
+�����|}&quot;&quot;&quot;.stripMargin</code></pre>
+</div>
+</div>
+<div class="paragraph">
+<p>The catalog defines a mapping between the HBase and Spark tables. There are two critical parts of this catalog.
+One is the row key definition and the other is the mapping between a table column in Spark and
+the column family and column qualifier in HBase. The above defines a schema for an HBase table
+named table1, with row key as key and a number of columns (col1 - col8). Note that the row key
+also has to be defined in detail as a column (col0), which has a specific cf (rowkey).</p>
+</div>
+</div>
+<div class="sect2">
+<h3 id="_save_the_dataframe"><a class="anchor" href="#_save_the_dataframe"></a>86.2. Save the DataFrame</h3>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">case class HBaseRecord(
+   col0: String,
+   col1: Boolean,
+   col2: Double,
+   col3: Float,
+   col4: Int, ������
+   col5: Long,
+   col6: Short,
+   col7: String,
+   col8: Byte)
+
+object HBaseRecord
+{ ������������������������������������������������������������������������������������������������������������
+   def apply(i: Int, t: String): HBaseRecord = {
+      val s = s&quot;&quot;&quot;row${&quot;%03d&quot;.format(i)}&quot;&quot;&quot; ������
+      HBaseRecord(s,
+      i % 2 == 0,
+      i.toDouble,
+      i.toFloat, �
+      i,
+      i.toLong,
+      i.toShort, �
+      s&quot;String$i: $t&quot;, �����
+      i.toByte)
+  }
+}
+
+val data = (0 to 255).map { i =&gt; �HBaseRecord(i, &quot;extra&quot;)}
+
+sc.parallelize(data).toDF.write.options(
+�Map(HBaseTableCatalog.tableCatalog -&gt; catalog, HBaseTableCatalog.newTable -&gt; &quot;5&quot;))
+�.format(&quot;org.apache.hadoop.hbase.spark &quot;)
+�.save()</code></pre>
+</div>
 </div>
 <div class="paragraph">
-<p>The HBase-Spark module includes support for Spark SQL and DataFrames, which allows
-you to write SparkSQL directly on HBase tables. In addition the HBase-Spark
-will push down query filtering logic to HBase.</p>
+<p><code>data</code>, prepared by the user, is a local Scala collection of 256 HBaseRecord objects.
+The <code>sc.parallelize(data)</code> call distributes <code>data</code> to form an RDD. <code>toDF</code> returns a DataFrame.
+The <code>write</code> function returns a DataFrameWriter used to write the DataFrame to external storage
+systems (HBase here). Given a DataFrame with the specified schema <code>catalog</code>, the <code>save</code> function
+creates an HBase table with 5 regions and saves the DataFrame into it.</p>
+</div>
+</div>
+<div class="sect2">
+<h3 id="_load_the_dataframe"><a class="anchor" href="#_load_the_dataframe"></a>86.3. Load the DataFrame</h3>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">def withCatalog(cat: String): DataFrame = {
+  sqlContext
+  .read
+  .options(Map(HBaseTableCatalog.tableCatalog-&gt;cat))
+  .format(&quot;org.apache.hadoop.hbase.spark&quot;)
+  .load()
+}
+val df = withCatalog(catalog)</code></pre>
+</div>
 </div>
 <div class="paragraph">
-<p>In HBaseSparkConf, four parameters related to timestamp can be set. They are TIMESTAMP,
-MIN_TIMESTAMP, MAX_TIMESTAMP and MAX_VERSIONS respectively. Users can query records
-with different timestamps or time ranges with MIN_TIMESTAMP and MAX_TIMESTAMP.
-In the meantime, use concrete value instead of tsSpecified and oldMs in the examples below.</p>
+<p>In the <code>withCatalog</code> function, sqlContext is an instance of SQLContext, which is the entry point
+for working with structured data (rows and columns) in Spark.
+<code>read</code> returns a DataFrameReader that can be used to read data in as a DataFrame.
+The <code>options</code> function adds input options for the underlying data source to the DataFrameReader,
+and the <code>format</code> function specifies the input data source format for the DataFrameReader.
+The <code>load()</code> function loads input in as a DataFrame. The DataFrame <code>df</code> returned
+by the <code>withCatalog</code> function can be used to access the HBase table, as in the query examples that follow.</p>
+</div>
+</div>
+<div class="sect2">
+<h3 id="_language_integrated_query"><a class="anchor" href="#_language_integrated_query"></a>86.4. Language Integrated Query</h3>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">val s = df.filter(($&quot;col0&quot; &lt;= &quot;row050&quot; &amp;&amp; $&quot;col0&quot; &gt; &quot;row040&quot;) ||
+  $&quot;col0&quot; === &quot;row005&quot; ||
+  $&quot;col0&quot; &lt;= &quot;row005&quot;)
+  .select(&quot;col0&quot;, &quot;col1&quot;, &quot;col4&quot;)
+s.show</code></pre>
+</div>
+</div>
+<div class="paragraph">
+<p>A DataFrame supports various operations, such as join, sort, select, filter, orderBy and so on.
+<code>df.filter</code> above filters rows using the given SQL expression. <code>select</code> selects a set of columns:
+<code>col0</code>, <code>col1</code> and <code>col4</code>.</p>
+</div>
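+<div class="paragraph">
+<p>As a further sketch, using the same <code>df</code> and the column names from the catalog above,
+other DataFrame operations can be chained in the same way:</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">// Sort by col4, largest first, and count the rows where col1 is true.
+df.select(&quot;col0&quot;, &quot;col4&quot;).orderBy($&quot;col4&quot;.desc).show()
+df.filter($&quot;col1&quot; === true).count()</code></pre>
+</div>
+</div>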
+</div>
+<div class="sect2">
+<h3 id="_sql_query"><a class="anchor" href="#_sql_query"></a>86.5. SQL Query</h3>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">df.registerTempTable(&quot;table1&quot;)
+sqlContext.sql(&quot;select count(col1) from table1&quot;).show</code></pre>
+</div>
+</div>
+<div class="paragraph">
+<p><code>registerTempTable</code> registers the <code>df</code> DataFrame as a temporary table using the table name <code>table1</code>.
+The lifetime of this temporary table is tied to the SQLContext that was used to create <code>df</code>.
+The <code>sqlContext.sql</code> function allows the user to execute SQL queries.</p>
+</div>
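+<div class="paragraph">
+<p>Any SQL supported by SparkSQL can be issued against the temporary table; for example,
+a sketch against the same <code>table1</code> registered above:</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">sqlContext.sql(&quot;select col0, col4 from table1 where col4 &gt; 100 order by col4&quot;).show</code></pre>
+</div>
+</div>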
 </div>
+<div class="sect2">
+<h3 id="_others"><a class="anchor" href="#_others"></a>86.6. Others</h3>
 <div class="exampleblock">
 <div class="title">Example 52. Query with different timestamps</div>
 <div class="content">
 <div class="paragraph">
+<p>In HBaseSparkConf, four timestamp-related parameters can be set: TIMESTAMP,
+MIN_TIMESTAMP, MAX_TIMESTAMP and MAX_VERSIONS. Users can query records with
+a specific timestamp, or with a time range bounded by MIN_TIMESTAMP and MAX_TIMESTAMP.
+Use concrete values instead of tsSpecified and oldMs in the examples below.</p>
+</div>
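+<div class="paragraph">
+<p>MAX_VERSIONS can be combined with a time range in the same options map. The sketch below is an
+illustration only: it reuses the writeCatalog and oldMs placeholders from the examples that follow,
+and assumes MAX_VERSIONS is passed as a string like the other options shown.</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">// Sketch: read up to 3 versions per cell within a time range.
+val dfVersions = sqlContext.read
+      .options(Map(HBaseTableCatalog.tableCatalog -&gt; writeCatalog,
+        HBaseSparkConf.MIN_TIMESTAMP -&gt; &quot;0&quot;,
+        HBaseSparkConf.MAX_TIMESTAMP -&gt; oldMs.toString,
+        HBaseSparkConf.MAX_VERSIONS -&gt; &quot;3&quot;))
+      .format(&quot;org.apache.hadoop.hbase.spark&quot;)
+      .load()</code></pre>
+</div>
+</div>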
+<div class="paragraph">
 <p>The example below shows how to load the df DataFrame at a specific timestamp.
 tsSpecified is specified by the user.
 HBaseTableCatalog defines the mapping between the HBase table and the Spark relation schema.
@@ -18867,10 +19182,10 @@ writeCatalog defines catalog for the schema mapping.</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre>val df = sqlContext.read
+<pre class="CodeRay highlight"><code data-lang="scala">val df = sqlContext.read
       .options(Map(HBaseTableCatalog.tableCatalog -&gt; writeCatalog, HBaseSparkConf.TIMESTAMP -&gt; tsSpecified.toString))
-      .format("org.apache.hadoop.hbase.spark")
-      .load()</pre>
+      .format(&quot;org.apache.hadoop.hbase.spark&quot;)
+      .load()</code></pre>
 </div>
 </div>
 <div class="paragraph">
@@ -18879,11 +19194,11 @@ oldMs is specified by the user.</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre>val df = sqlContext.read
-      .options(Map(HBaseTableCatalog.tableCatalog -&gt; writeCatalog, HBaseSparkConf.MIN_TIMESTAMP -&gt; "0",
+<pre class="CodeRay highlight"><code data-lang="scala">val df = sqlContext.read
+      .options(Map(HBaseTableCatalog.tableCatalog -&gt; writeCatalog, HBaseSparkConf.MIN_TIMESTAMP -&gt; &quot;0&quot;,
         HBaseSparkConf.MAX_TIMESTAMP -&gt; oldMs.toString))
-      .format("org.apache.hadoop.hbase.spark")
-      .load()</pre>
+      .format(&quot;org.apache.hadoop.hbase.spark&quot;)
+      .load()</code></pre>
 </div>
 </div>
 <div class="paragraph">
@@ -18891,162 +19206,149 @@ oldMs is specified by the user.</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre>    df.registerTempTable("table")
-    sqlContext.sql("select count(col1) from table").show</pre>
+<pre class="CodeRay highlight"><code data-lang="scala">df.registerTempTable(&quot;table&quot;)
+sqlContext.sql(&quot;select count(col1) from table&quot;).show</code></pre>
 </div>
 </div>
 </div>
 </div>
-<div class="sect2">
-<h3 id="_predicate_push_down"><a class="anchor" href="#_predicate_push_down"></a>86.1. Predicate Push Down</h3>
+<div class="exampleblock">
+<div class="title">Example 53. Native Avro support</div>
+<div class="content">
 <div class="paragraph">
-<p>There are two examples of predicate push down in the HBase-Spark implementation.
-The first example shows the push down of filtering logic on the RowKey. HBase-Spark
-will reduce the filters on RowKeys down to a set of Get and/or Scan commands.</p>
-</div>
-<div class="admonitionblock note">
-<table>
-<tr>
-<td class="icon">
-<i class="fa icon-note" title="Note"></i>
-</td>
-<td class="content">
-The Scans are distributed scans, rather than a single client scan operation.
-</td>
-</tr>
-</table>
+<p>The HBase-Spark Connector supports different data formats such as Avro and JSON. The use case below
+shows how Spark supports Avro. Users can persist Avro records into HBase directly. Internally,
+the Avro schema is converted to a native Spark Catalyst data type automatically.
+Note that both the key and value parts of an HBase table can be defined in Avro format.</p>
 </div>
 <div class="paragraph">
-<p>If the query looks something like the following, the logic will push down and get
-the rows through 3 Gets and 0 Scans. We can do gets because all the operations
-are <code>equal</code> operations.</p>
+<p>1) Define catalog for the schema mapping:</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre class="CodeRay highlight"><code data-lang="sql"><span class="class">SELECT</span>
-  KEY_FIELD,
-  B_FIELD,
-  A_FIELD
-<span class="keyword">FROM</span> hbaseTmp
-<span class="keyword">WHERE</span> (KEY_FIELD = <span class="string"><span class="delimiter">'</span><span class="content">get1</span><span class="delimiter">'</span></span> <span class="keyword">or</span> KEY_FIELD = <span class="string"><span class="delimiter">'</span><span class="content">get2</span><span class="delimiter">'</span></span> <span class="keyword">or</span> KEY_FIELD = <span class="string"><span class="delimiter">'</span><span class="content">get3</span><span class="delimiter">'</span></span>)</code></pre>
+<pre class="CodeRay highlight"><code data-lang="scala">def catalog = s&quot;&quot;&quot;{
+                     |&quot;table&quot;:{&quot;namespace&quot;:&quot;default&quot;, &quot;name&quot;:&quot;Avrotable&quot;},
+                      |&quot;rowkey&quot;:&quot;key&quot;,
+                      |&quot;columns&quot;:{
+                      |&quot;col0&quot;:{&quot;cf&quot;:&quot;rowkey&quot;, &quot;col&quot;:&quot;key&quot;, &quot;type&quot;:&quot;string&quot;},
+                      |&quot;col1&quot;:{&quot;cf&quot;:&quot;cf1&quot;, &quot;col&quot;:&quot;col1&quot;, &quot;type&quot;:&quot;binary&quot;}
+                      |}
+                      |}&quot;&quot;&quot;.stripMargin</code></pre>
 </div>
 </div>
 <div class="paragraph">
-<p>Now let&#8217;s look at an example where we will end up doing two scans on HBase.</p>
+<p><code>catalog</code> is a schema for an HBase table named <code>Avrotable</code>, with row key as key and
+one column, col1. The row key also has to be defined in detail as a column (col0),
+which has a specific cf (rowkey).</p>
+</div>
+<div class="paragraph">
+<p>2) Prepare the Data:</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre class="CodeRay highlight"><code data-lang="sql"><span class="class">SELECT</span>
-  KEY_FIELD,
-  B_FIELD,
-  A_FIELD
-<span class="keyword">FROM</span> hbaseTmp
-<span class="keyword">WHERE</span> KEY_FIELD &lt; <span class="string"><span class="delimiter">'</span><span class="content">get2</span><span class="delimiter">'</span></span> <span class="keyword">or</span> KEY_FIELD &gt; <span class="string"><span class="delimiter">'</span><span class="content">get3</span><span class="delimiter">'</span></span></code></pre>
+<pre class="CodeRay highlight"><code data-lang="scala"> object AvroHBaseRecord {
+   val schemaString =
+     s&quot;&quot;&quot;{&quot;namespace&quot;: &quot;example.avro&quot;,
+         |   &quot;type&quot;: &quot;record&quot;,      &quot;name&quot;: &quot;User&quot;,
+         |    &quot;fields&quot;: [
+         |        {&quot;name&quot;: &quot;name&quot;, &quot;type&quot;: &quot;string&quot;},
+         |        {&quot;name&quot;: &quot;favorite_number&quot;,  &quot;type&quot;: [&quot;int&quot;, &quot;null&quot;]},
+         |        {&quot;name&quot;: &quot;favorite_color&quot;, &quot;type&quot;: [&quot;string&quot;, &quot;null&quot;]},
+         |        {&quot;name&quot;: &quot;favorite_array&quot;, &quot;type&quot;: {&quot;type&quot;: &quot;array&quot;, &quot;items&quot;: &quot;string&quot;}},
+         |        {&quot;name&quot;: &quot;favorite_map&quot;, &quot;type&quot;: {&quot;type&quot;: &quot;map&quot;, &quot;values&quot;: &quot;int&quot;}}
+         |      ]    }&quot;&quot;&quot;.stripMargin
+
+   val avroSchema: Schema = {
+     val p = new Schema.Parser
+     p.parse(schemaString)
+   }
+
+   def apply(i: Int): AvroHBaseRecord = {
+     val user = new GenericData.Record(avroSchema);
+     user.put(&quot;name&quot;, s&quot;name${&quot;%03d&quot;.format(i)}&quot;)
+     user.put(&quot;favorite_number&quot;, i)
+     user.put(&quot;favorite_color&quot;, s&quot;color${&quot;%03d&quot;.format(i)}&quot;)
+     val favoriteArray = new GenericData.Array[String](2, avroSchema.getField(&quot;favorite_array&quot;).schema())
+     favoriteArray.add(s&quot;number${i}&quot;)
+     favoriteArray.add(s&quot;number${i+1}&quot;)
+     user.put(&quot;favorite_array&quot;, favoriteArray)
+     import collection.JavaConverters._
+     val favoriteMap = Map[String, Int]((&quot;key1&quot; -&gt; i), (&quot;key2&quot; -&gt; (i+1))).asJava
+     user.put(&quot;favorite_map&quot;, favoriteMap)
+     val avroByte = AvroSedes.serialize(user, avroSchema)
+     AvroHBaseRecord(s&quot;name${&quot;%03d&quot;.format(i)}&quot;, avroByte)
+   }
+ }
+
+ val data = (0 to 255).map { i =&gt;
+    AvroHBaseRecord(i)
+ }</code></pre>
 </div>
 </div>
 <div class="paragraph">
-<p>In this example we will get 0 Gets and 2 Scans. One scan will load everything
-from the first row in the table until \u201cget2\u201d and the second scan will get
-everything from \u201cget3\u201d until the last row in the table.</p>
+<p><code>schemaString</code> is defined first, and then parsed to get <code>avroSchema</code>. <code>avroSchema</code> is used to
+generate <code>AvroHBaseRecord</code> instances. <code>data</code>, prepared by the user, is a local Scala collection
+of 256 <code>AvroHBaseRecord</code> objects.</p>
 </div>
 <div class="paragraph">
-<p>The next query is a good example of having a good deal of range checks. However
-the ranges overlap. To the code will be smart enough to get the following data
-in a single scan that encompasses all the data asked by the query.</p>
+<p>3) Save DataFrame:</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre class="CodeRay highlight"><code data-lang="sql"><span class="class">SELECT</span>
-  KEY_FIELD,
-  B_FIELD,
-  A_FIELD
-<span class="keyword">FROM</span> hbaseTmp
-<span class="keyword">WHERE</span>
-  (KEY_FIELD &gt;= <span class="string"><span class="delimiter">'</span><span class="content">get1</span><span class="delimiter">'</span></span> <span class="keyword">and</span> KEY_FIELD &lt;= <span class="string"><span class="delimiter">'</span><span class="content">get3</span><span class="delimiter">'</span></span>) <span class="keyword">or</span>
-  (KEY_FIELD &gt; <span class="string"><span class="delimiter">'</span><span class="content">get3</span><span class="delimiter">'</span></span> <span class="keyword">and</span> KEY_FIELD &lt;= <span class="string"><span class="delimiter">'</span><span class="content">get5</span><span class="delimiter">'</span></span>)</code></pre>
+<pre class="CodeRay highlight"><code data-lang="scala"> sc.parallelize(data).toDF.write.options(
+     Map(HBaseTableCatalog.tableCatalog -&gt; catalog, HBaseTableCatalog.newTable -&gt; &quot;5&quot;))
+     .format(&quot;org.apache.spark.sql.execution.datasources.hbase&quot;)
+     .save()</code></pre>
 </div>
 </div>
 <div class="paragraph">
-<p>The second example of push down functionality offered by the HBase-Spark module
-is the ability to push down filter logic for column and cell fields. Just like
-the RowKey logic, all query logic will be consolidated into the minimum number
-of range checks and equal checks by sending a Filter object along with the Scan
-with information about consolidated push down predicates</p>
+<p>Given a DataFrame with the specified schema <code>catalog</code>, the code above creates an HBase table with 5
+regions and saves the DataFrame into it.</p>
 </div>
-<div class="exampleblock">
-<div class="title">Example 53. SparkSQL Code Example</div>
-<div class="content">
 <div class="paragraph">
-<p>This example shows how we can interact with HBase with SQL.</p>
+<p>4) Load the DataFrame</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre class="CodeRay highlight"><code data-lang="scala">val sc = new SparkContext(&quot;local&quot;, &quot;test&quot;)
-val config = new HBaseConfiguration()
-
-new HBaseContext(sc, TEST_UTIL.getConfiguration)
-val sqlContext = new SQLContext(sc)
+<pre class="CodeRay highlight"><code data-lang="scala">def avroCatalog = s&quot;&quot;&quot;{
+            |&quot;table&quot;:{&quot;namespace&quot;:&quot;default&quot;, &quot;name&quot;:&quot;avrotable&quot;},
+            |&quot;rowkey&quot;:&quot;key&quot;,
+            |&quot;columns&quot;:{
+              |&quot;col0&quot;:{&quot;cf&quot;:&quot;rowkey&quot;, &quot;col&quot;:&quot;key&quot;, &quot;type&quot;:&quot;string&quot;},
+              |&quot;col1&quot;:{&quot;cf&quot;:&quot;cf1&quot;, &quot;col&quot;:&quot;col1&quot;, &quot;avro&quot;:&quot;avroSchema&quot;}
+            |}
+          |}&quot;&quot;&quot;.stripMargin
 
-df = sqlContext.load(&quot;org.apache.hadoop.hbase.spark&quot;,
-  Map(&quot;hbase.columns.mapping&quot; -&gt;
-   &quot;KEY_FIELD STRING :key, A_FIELD STRING c:a, B_FIELD STRING c:b&quot;,
-   &quot;hbase.table&quot; -&gt; &quot;t1&quot;))
-
-df.registerTempTable(&quot;hbaseTmp&quot;)
-
-val results = sqlContext.sql(&quot;SELECT KEY_FIELD, B_FIELD FROM hbaseTmp &quot; +
-  &quot;WHERE &quot; +
-  &quot;(KEY_FIELD = 'get1' and B_FIELD &lt; '3') or &quot; +
-  &quot;(KEY_FIELD &gt;= 'get3' and B_FIELD = '8')&quot;).take(5)</code></pre>
+ def withCatalog(cat: String): DataFrame = {
+     sqlContext
+         .read
+         .options(Map(&quot;avroSchema&quot; -&gt; AvroHBaseRecord.schemaString, HBaseTableCatalog.tableCatalog -&gt; cat))
+         .format(&quot;org.apache.spark.sql.execution.datasources.hbase&quot;)
+         .load()
+ }
+ val df = withCatalog(avroCatalog)</code></pre>
 </div>
 </div>
 <div class="paragraph">
-<p>There are three major parts of this example that deserve explaining.</p>
-</div>
-<div class="dlist">
-<dl>
-<dt class="hdlist1">The sqlContext.load function</dt>
-<dd>
-<p>In the sqlContext.load function we see two
-parameters. The first of these parameters is pointing Spark to the HBase
-DefaultSource class that will act as the interface between SparkSQL and HBase.</p>
-</dd>
-<dt class="hdlist1">A map of key value pairs</dt>
-<dd>
-<p>In this example we have two keys in our map, <code>hbase.columns.mapping</code> and
-<code>hbase.table</code>. The <code>hbase.table</code> directs SparkSQL to use the given HBase table.
-The <code>hbase.columns.mapping</code> key give us the logic to translate HBase columns to
-SparkSQL columns.</p>
-<div class="paragraph">
-<p>The <code>hbase.columns.mapping</code> is a string that follows the following format</p>
-</div>
-<div class="listingblock">
-<div class="content">
-<pre class="CodeRay highlight"><code data-lang="scala">(SparkSQL.ColumnName) (SparkSQL.ColumnType) (HBase.ColumnFamily):(HBase.Qualifier)</code></pre>
-</div>
+<p>In the <code>withCatalog</code> function, <code>read</code> returns a DataFrameReader that can be used to read data in as a DataFrame.
+The <code>options</code> function adds input options for the underlying data source to the DataFrameReader.
+There are two options: one sets <code>avroSchema</code> to <code>AvroHBaseRecord.schemaString</code>, and the other sets
+<code>HBaseTableCatalog.tableCatalog</code> to <code>avroCatalog</code>. The <code>load()</code> function loads input in as a DataFrame.
+The DataFrame <code>df</code> returned by the <code>withCatalog</code> function can be used to access the HBase table.</p>
 </div>
 <div class="paragraph">
-<p>In the example below we see the definition of three fields. Because KEY_FIELD has
-no ColumnFamily, it is the RowKey.</p>
+<p>5) SQL Query</p>
 </div>
 <div class="listingblock">
 <div class="content">
-<pre>KEY_FIELD STRING :key, A_FIELD STRING c:a, B_FIELD STRING c:b</pre>
-</div>
+<pre class="CodeRay highlight"><code data-lang="scala"> df.registerTempTable(&quot;avrotable&quot;)
+ val c = sqlContext.sql(&quot;select count(1) from avrotable&quot;)</code></pre>
 </div>
-</dd>
-<dt class="hdlist1">The registerTempTable function</dt>
-<dd>
-<p>This is a SparkSQL function that allows us now to be free of Scala when accessing
-our HBase table directly with SQL with the table name of "hbaseTmp".</p>
-</dd>
-</dl>
 </div>
 <div class="paragraph">
-<p>The last major point to note in the example is the <code>sqlContext.sql</code> function, which
-allows the user to ask their questions in SQL which will be pushed down to the
-DefaultSource code in the HBase-Spark module. The result of this command will be
-a DataFrame with the Schema of KEY_FIELD and B_FIELD.</p>
+<p>After loading the df DataFrame, users can query the data. <code>registerTempTable</code> registers the df DataFrame
+as a temporary table using the table name avrotable. The <code>sqlContext.sql</code> function allows the
+user to execute SQL queries.</p>
 </div>
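+<div class="paragraph">
+<p>The result computed above is itself a DataFrame and can be displayed or reused; for example,
+as a sketch against the same temporary table:</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="scala">c.show()
+sqlContext.sql(&quot;select col0 from avrotable limit 10&quot;).show</code></pre>
+</div>
+</div>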
 </div>
 </div>
@@ -19678,7 +19980,7 @@ and <code>salaryDet</code>, containing personal and salary details. Below is the
 of the <code>users</code> table.</p>
 </div>
 <table class="tableblock frame-all grid-all spread">
-<caption class="title">Table 11. Users Table</caption>
+<caption class="title">Table 17. Users Table</caption>
 <colgroup>
 <col style="width: 14%;">
 <col style="width: 14%;">
@@ -28353,7 +28655,7 @@ End-of-life releases are not included in this list.
 </table>
 </div>
 <table class="tableblock frame-all grid-all spread">
-<caption class="title">Table 12. Release Managers</caption>
+<caption class="title">Table 18. Release Managers</caption>
 <colgroup>
 <col style="width: 50%;">
 <col style="width: 50%;">
@@ -30478,7 +30780,7 @@ The following cheat sheet is included for your reference. More nuanced and compr
 is available at <a href="http://asciidoctor.org/docs/user-manual/" class="bare">http://asciidoctor.org/docs/user-manual/</a>.</p>
 </div>
 <table class="tableblock frame-all grid-all spread">
-<caption class="title">Table 13. AsciiDoc Cheat Sheet</caption>
+<caption class="title">Table 19. AsciiDoc Cheat Sheet</caption>
 <colgroup>
 <col style="width: 33%;">
 <col style="width: 33%;">
@@ -31529,7 +31831,7 @@ In case the table goes out of date, the unit tests which check for accuracy of p
 </dl>
 </div>
 <table class="tableblock frame-all grid-all spread">
-<caption class="title">Table 14. ACL Matrix</caption>
+<caption class="title">Table 20. ACL Matrix</caption>
 <colgroup>
 <col style="width: 33%;">
 <col style="width: 33%;">
@@ -32991,7 +33293,7 @@ Note that the size of the trailer is different depending on the version, so it i
 However, the version is always stored as the last four-byte integer in the file.</p>
 </div>
 <table class="tableblock frame-all grid-all spread">
-<caption class="title">Table 15. Differences between HFile Versions 1 and 2</caption>
+<caption class="title">Table 21. Differences between HFile Versions 1 and 2</caption>
 <colgroup>
 <col style="width: 50%;">
 <col style="width: 50%;">

http://git-wip-us.apache.org/repos/asf/hbase-site/blob/975096b1/bulk-loads.html
----------------------------------------------------------------------
diff --git a/bulk-loads.html b/bulk-loads.html
index 6b09bad..b4c1ea7 100644
--- a/bulk-loads.html
+++ b/bulk-loads.html
@@ -7,7 +7,7 @@
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20160712" />
+    <meta name="Date-Revision-yyyymmdd" content="20160714" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache HBase &#x2013;  
       Bulk Loads in Apache HBase (TM)
@@ -305,7 +305,7 @@ under the License. -->
                         <a href="http://www.apache.org/">The Apache Software Foundation</a>.
             All rights reserved.      
                     
-                  <li id="publishDate" class="pull-right">Last Published: 2016-07-12</li>
+                  <li id="publishDate" class="pull-right">Last Published: 2016-07-14</li>
             </p>
                 </div>