Posted to commits@phoenix.apache.org by td...@apache.org on 2015/07/29 23:53:48 UTC

svn commit: r1693353 - in /phoenix/site/publish: dynamic_columns.html language/index.html recent.html secondary_indexing.html

Author: tdsilva
Date: Wed Jul 29 21:53:48 2015
New Revision: 1693353

URL: http://svn.apache.org/r1693353
Log:
Added information about the MR index build, how to upsert dynamic columns and other minor fixes

Modified:
    phoenix/site/publish/dynamic_columns.html
    phoenix/site/publish/language/index.html
    phoenix/site/publish/recent.html
    phoenix/site/publish/secondary_indexing.html

Modified: phoenix/site/publish/dynamic_columns.html
URL: http://svn.apache.org/viewvc/phoenix/site/publish/dynamic_columns.html?rev=1693353&r1=1693352&r2=1693353&view=diff
==============================================================================
--- phoenix/site/publish/dynamic_columns.html (original)
+++ phoenix/site/publish/dynamic_columns.html Wed Jul 29 21:53:48 2015
@@ -155,10 +155,17 @@ WHERE eventType = 'OOM' AND lastGCTime &
  <pre>CREATE TABLE EventLog (
     eventId BIGINT NOT NULL,
     eventTime TIME NOT NULL,
-    eventType CHAR(3) NOT NULL
+    eventType CHAR(3)
     CONSTRAINT pk PRIMARY KEY (eventId, eventTime))
 </pre>
 </div>
+<p>To upsert a row with dynamic columns:</p> 
+<div class="source"> 
+ <pre>UPSERT INTO EventLog (eventId, eventTime, eventType, lastGCTime TIME, usedMemory BIGINT, maxMemory BIGINT)
+      VALUES(1, CURRENT_TIME(), 'abc', CURRENT_TIME(), 512, 1024);
+</pre>
+</div>
+
 			</div>
 		</div>
 	</div>

Modified: phoenix/site/publish/language/index.html
URL: http://svn.apache.org/viewvc/phoenix/site/publish/language/index.html?rev=1693353&r1=1693352&r2=1693353&view=diff
==============================================================================
--- phoenix/site/publish/language/index.html (original)
+++ phoenix/site/publish/language/index.html Wed Jul 29 21:53:48 2015
@@ -570,7 +570,7 @@ syntax-end -->
 <p>Creates a new table. The <code>HBase</code> table and any column families referenced are created if they don&#39;t already exist. All table, column family and column names are uppercased unless they are double quoted in which case they are case sensitive. Column families that exist in the <code>HBase</code> table but are not listed are ignored. At create time, to improve query performance, an empty key value is added to the first column family of any existing rows or the default column family if no column families are explicitly defined. Upserts will also add this empty key value. This improves query performance by having a key value column we can guarantee always being there and thus minimizing the amount of data that must be projected and subsequently returned back to the client. <code>HBase</code> table and column configuration options may be passed through as key/value pairs to configure the <code>HBase</code> table as desired. Note that when using the <code>IF NOT EXISTS</co
 de> clause, if a table already exists, then no change will be made to it. Additionally, no validation is done to check whether the existing table metadata matches the proposed table metadata, so it&#39;s better to use <code>DROP TABLE</code> followed by <code>CREATE TABLE</code> if the table metadata may be changing.</p>
 <p>Example:</p>
 <p class="notranslate">
-CREATE TABLE my_schema.my_table ( id BIGINT not null primary key, date DATE not null)<br />CREATE TABLE my_table ( id INTEGER not null primary key desc, date DATE not null,<br />&nbsp;&nbsp;&nbsp;&nbsp;m.db_utilization DECIMAL, i.db_utilization)<br />&nbsp;&nbsp;&nbsp;&nbsp;m.DATA_BLOCK_ENCODING=&#39;DIFF&#39;<br />CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date date not null,<br />&nbsp;&nbsp;&nbsp;&nbsp;txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) )<br />CREATE TABLE IF NOT EXISTS &quot;my_case_sensitive_table&quot;<br />&nbsp;&nbsp;&nbsp;&nbsp;( &quot;id&quot; char(10) not null primary key, &quot;value&quot; integer)<br />&nbsp;&nbsp;&nbsp;&nbsp;DATA_BLOCK_ENCODING=&#39;NONE&#39;,VERSIONS=5,MAX_FILESIZE=2000000 split on (?, ?, ?)<br />CREATE TABLE IF NOT EXISTS my_schema.my_table (<br />&nbsp;&nbsp;&nbsp;&nbsp;org_id CHAR(15), entity_id CHAR(15), payload binary(1000),<br />&nbsp;&nbsp;&nbsp;&nbsp;CONSTRAINT pk PRIMARY KEY (org_id, entity_
 id) )<br />&nbsp;&nbsp;&nbsp;&nbsp;TTL=86400</p>
+CREATE TABLE my_schema.my_table ( id BIGINT not null primary key, date DATE)<br />CREATE TABLE my_table ( id INTEGER not null primary key desc, date DATE,<br />&nbsp;&nbsp;&nbsp;&nbsp;m.db_utilization DECIMAL, i.db_utilization)<br />&nbsp;&nbsp;&nbsp;&nbsp;m.DATA_BLOCK_ENCODING=&#39;DIFF&#39;<br />CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date date,<br />&nbsp;&nbsp;&nbsp;&nbsp;txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) )<br />CREATE TABLE IF NOT EXISTS &quot;my_case_sensitive_table&quot;<br />&nbsp;&nbsp;&nbsp;&nbsp;( &quot;id&quot; char(10) not null primary key, &quot;value&quot; integer)<br />&nbsp;&nbsp;&nbsp;&nbsp;DATA_BLOCK_ENCODING=&#39;NONE&#39;,VERSIONS=5,MAX_FILESIZE=2000000 split on (?, ?, ?)<br />CREATE TABLE IF NOT EXISTS my_schema.my_table (<br />&nbsp;&nbsp;&nbsp;&nbsp;org_id CHAR(15), entity_id CHAR(15), payload binary(1000),<br />&nbsp;&nbsp;&nbsp;&nbsp;CONSTRAINT pk PRIMARY KEY (org_id, entity_id) )<br />&nbsp;&nbsp;&nbs
 p;&nbsp;TTL=86400</p>
 
 <h3 id="drop_table" class="notranslate">DROP TABLE</h3>
 <!-- railroad-start -->
@@ -746,10 +746,11 @@ ALTER TABLE my_schema.my_table ADD d.dep
 CREATE INDEX [IF NOT EXISTS] <a href="index.html#name">indexName</a>
 ON <a href="index.html#table_ref">tableRef</a> ( <a href="index.html#expression">expression</a> [ASC | DESC] [,...] )
 [ INCLUDE ( <a href="index.html#column_ref">columnRef</a> [,...] ) ]
+[ASYNC]
 [<a href="index.html#options">indexOptions</a>] [ SPLIT ON ( <a href="index.html#split_point">splitPoint</a> [,...] ) ]
 </pre>
 <div name="railroad">
-<table class="railroad"><tr class="railroad"><td class="d"><code class="c">CREATE INDEX</code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="d"><code class="c">IF NOT EXISTS</code></td></tr></table></td><td class="le"></td></tr></table></td><td class="d"><code class="c"><a href="index.html#name">indexName</a></code></td></tr></table><br /><table class="railroad"><tr class="railroad"><td class="d"><code class="c">ON <a href="index.html#table_ref">tableRef</a> ( <a href="index.html#expression">expression</a></code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d"><code class="
 c">ASC</code></td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c">DESC</code></td><td class="le"></td></tr></table></td><td class="le"></td></tr></table></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c">, ...</code></td><td class="le"></td></tr></table></td><td class="d"><code class="c">)</code></td></tr></table><br /><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="d"><code class="c">INCLUDE ( <a href="index.html#column_ref">columnRef</a></code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="l
 s"></td><td class="d"><code class="c">, ...</code></td><td class="le"></td></tr></table></td><td class="d"><code class="c">)</code></td></tr></table></td><td class="le"></td></tr></table><br /><table class="railroad"><tr class="railroad"><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c"><a href="index.html#options">indexOptions</a></code></td><td class="le"></td></tr></table></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="d"><code class="c">SPLIT ON ( <a href="index.html#split_point">splitPoint</a></code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr cl
 ass="railroad"><td class="ls"></td><td class="d"><code class="c">, ...</code></td><td class="le"></td></tr></table></td><td class="d"><code class="c">)</code></td></tr></table></td><td class="le"></td></tr></table></td></tr></table>
+<table class="railroad"><tr class="railroad"><td class="d"><code class="c">CREATE INDEX</code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="d"><code class="c">IF NOT EXISTS</code></td></tr></table></td><td class="le"></td></tr></table></td><td class="d"><code class="c"><a href="index.html#name">indexName</a></code></td></tr></table><br /><table class="railroad"><tr class="railroad"><td class="d"><code class="c">ON <a href="index.html#table_ref">tableRef</a> ( <a href="index.html#expression">expression</a></code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d"><code class="
 c">ASC</code></td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c">DESC</code></td><td class="le"></td></tr></table></td><td class="le"></td></tr></table></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c">, ...</code></td><td class="le"></td></tr></table></td><td class="d"><code class="c">)</code></td></tr></table><br /><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="d"><code class="c">INCLUDE ( <a href="index.html#column_ref">columnRef</a></code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="l
 s"></td><td class="d"><code class="c">, ...</code></td><td class="le"></td></tr></table></td><td class="d"><code class="c">)</code></td></tr></table></td><td class="le"></td></tr></table><br /><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c">ASYNC</code></td><td class="le"></td></tr></table></td></td></tr></table></td><td class="le"></td></tr></table><br /><table class="railroad"><tr class="railroad"><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c"><a href="index.html#options">indexOptions</a></code></td><td class="le"></td></tr></table></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="
 ls"></td><td class="d"><table class="railroad"><tr class="railroad"><td class="d"><code class="c">SPLIT ON ( <a href="index.html#split_point">splitPoint</a></code></td><td class="d"><table class="railroad"><tr class="railroad"><td class="ts"></td><td class="d">&nbsp;</td><td class="te"></td></tr><tr class="railroad"><td class="ls"></td><td class="d"><code class="c">, ...</code></td><td class="le"></td></tr></table></td><td class="d"><code class="c">)</code></td></tr></table></td><td class="le"></td></tr></table></td></tr></table>
 </div>
 <!-- railroad-end -->
 <!-- syntax-start
@@ -760,7 +761,7 @@ ON <a href="index.html#table_ref">tableR
 [<a href="index.html#options">indexOptions</a>] [ SPLIT ON ( <a href="index.html#split_point">splitPoint</a> [,...] ) ]
 </pre>
 syntax-end -->
-<p>Creates a new secondary index on a table or view. The index will be automatically kept in sync with the table as the data changes. At query time, the optimizer will use the index if it contains all columns referenced in the query and produces the most efficient execution plan. If a table has rows that are write-once and append-only, then the table may set the <code>IMMUTABLE_ROWS</code> property to true (either up-front in the <code>CREATE TABLE</code> statement or afterwards in an <code>ALTER TABLE</code> statement). This reduces the overhead at write time to maintain the index. Otherwise, if this property is not set on the table, then incremental index maintenance will be performed on the server side when the data changes. As of the 4.3 release, functional indexes are supported which allow arbitrary expressions rather than solely column names to be indexed.</p>
+<p>Creates a new secondary index on a table or view. The index will be automatically kept in sync with the table as the data changes. At query time, the optimizer will use the index if it contains all columns referenced in the query and produces the most efficient execution plan. If a table has rows that are write-once and append-only, then the table may set the <code>IMMUTABLE_ROWS</code> property to true (either up-front in the <code>CREATE TABLE</code> statement or afterwards in an <code>ALTER TABLE</code> statement). This reduces the overhead at write time to maintain the index. Otherwise, if this property is not set on the table, then incremental index maintenance will be performed on the server side when the data changes. As of the 4.3 release, functional indexes are supported which allow arbitrary expressions rather than solely column names to be indexed. As of the 4.4.0 release, you can specify the <code>ASYNC</code> keyword to create the index asynchronously.</p>
 <p>Example:</p>
 <p class="notranslate">
 CREATE INDEX my_idx ON sales.opportunity(last_updated_date DESC)<br />CREATE INDEX my_idx ON log.event(created_date DESC) INCLUDE (name, payload) SALT_BUCKETS=10<br />CREATE INDEX IF NOT EXISTS my_comp_idx ON server_metrics ( gc_time DESC, created_date DESC )<br />&nbsp;&nbsp;&nbsp;&nbsp;DATA_BLOCK_ENCODING=&#39;NONE&#39;,VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)<br />CREATE INDEX my_idx ON sales.opportunity(UPPER(contact_name))</p>

Modified: phoenix/site/publish/recent.html
URL: http://svn.apache.org/viewvc/phoenix/site/publish/recent.html?rev=1693353&r1=1693352&r2=1693353&view=diff
==============================================================================
--- phoenix/site/publish/recent.html (original)
+++ phoenix/site/publish/recent.html Wed Jul 29 21:53:48 2015
@@ -144,7 +144,8 @@
 </div> 
 <p>As items are implemented from our road map, they are moved here to track the progress we’ve made:</p> 
 <ol style="list-style-type: decimal"> 
- <li><b><a href="udf.html">User Defined Functions</a></b>. Allows users to create and deploy their own custom or domain-specific user-defined functions to the cluster. <b>Available in our 4.4 release</b></li> 
+ <li><b><a href="udf.html">User Defined Functions</a></b>. Allows users to create and deploy their own custom or domain-specific user-defined functions to the cluster. <b>Available in our 4.4 release</b></li>
+ <li><b><a href="secondary_indexing.html#MR_Index_Build">Map Reduce Index Build</a></b>. Enables an index to be created asynchronously using a map reduce job. <b>Available in our 4.4 release</b></li> 
  <li><b><a href="secondary_indexing.html#Functional_Indexes">Functional Indexes</a></b>. Enables an index to be defined as expressions as opposed to just column names and have the index be used when a query contains this expression. <b>Available in our 4.3 release</b></li> 
  <li><b><a href="phoenix_mr.html">Map-reduce Integration</a></b>. Support general map-reduce integration to Phoenix by implementing custom input and output formats.</li> 
  <li><b><a href="update_statistics.html">Statistics Collection</a></b>. Collects the statistics for a table to improve query parallelization. <b>Available in our 3.2/4.2 release</b></li> 

Modified: phoenix/site/publish/secondary_indexing.html
URL: http://svn.apache.org/viewvc/phoenix/site/publish/secondary_indexing.html?rev=1693353&r1=1693352&r2=1693353&view=diff
==============================================================================
--- phoenix/site/publish/secondary_indexing.html (original)
+++ phoenix/site/publish/secondary_indexing.html Wed Jul 29 21:53:48 2015
@@ -166,6 +166,15 @@
  <p>All indexes on a table declared with <tt>IMMUTABLE_ROWS=true</tt> are considered immutable (note that by default, tables are considered mutable). For global immutable indexes, the index is maintained entirely on the client-side with the index table being generated as changes to the data table occur. Local immutable indexes, on the other hand, are maintained on the server-side. Note that no safeguards are in-place to enforce that a table declared as immutable doesn’t actually mutate data (as that would negate the performance gain achieved). If that were to occur, the index would no longer be in sync with the table.</p> 
 </div> 
 <div class="section"> 
+ <h2 id="MR_Index_Build">Map Reduce Index Build</h2> 
+ <p>As of the 4.4.0 release it is possible to use a map reduce job to create an index asynchronously by including the <tt>ASYNC</tt> keyword in the index creation DDL statement:</p> 
+ <div class="source"> 
+  <pre>CREATE INDEX async_index ON my_table (v) ASYNC;
+</pre> 
+ </div> 
+ <p>The map reduce job that populates the index table must be kicked off separately.</p> 
+</div> 
+<div class="section"> 
  <h2 id="Examples">Examples</h2> 
  <p>Given the schema shown here:</p> 
  <div class="source"> 
@@ -210,12 +219,14 @@ CREATE LOCAL INDEX my_index ON my_table
   <div class="source"> 
    <pre>CREATE INDEX upper_name_idx ON employee (UPPER(name)) INCLUDE(name);
 </pre> 
-  </div> 
-  <p>With this index in place, when the following query is issued, the index would be used instead of the data table to retrieve the results:</p> 
+  </div>
+<div class="section"> 
+  <h3 id="MR_Index">Map Reduce Index Build</h3> 
+  <p>The IndexTool class is called to create a map reduce job that populates the index. It can be triggered via the HBase command line binary. For example:</p> 
   <div class="source"> 
-   <pre>SELECT id, name FROM employee WHERE UPPER(NAME)='John Doe';
+   <pre>${HBASE_HOME}/bin/hbase org.apache.phoenix.mapreduce.index.IndexTool -dt DATA_TABLE -it ASYNC_IDX  -op ASYNC_IDX_HFILES
 </pre> 
-  </div> 
+  </div>  
  </div> 
  <div class="section"> 
   <h3 id="Index_Sort_Order">Index Sort Order</h3>