Posted to commits@phoenix.apache.org by ja...@apache.org on 2014/04/21 19:51:56 UTC

svn commit: r1588939 - in /incubator/phoenix/site: publish/pig_integration.html source/src/site/markdown/pig_integration.md

Author: jamestaylor
Date: Mon Apr 21 17:51:56 2014
New Revision: 1588939

URL: http://svn.apache.org/r1588939
Log:
Updates to Pig Integration (Ravi)

Modified:
    incubator/phoenix/site/publish/pig_integration.html
    incubator/phoenix/site/source/src/site/markdown/pig_integration.md

Modified: incubator/phoenix/site/publish/pig_integration.html
URL: http://svn.apache.org/viewvc/incubator/phoenix/site/publish/pig_integration.html?rev=1588939&r1=1588938&r2=1588939&view=diff
==============================================================================
--- incubator/phoenix/site/publish/pig_integration.html (original)
+++ incubator/phoenix/site/publish/pig_integration.html Mon Apr 21 17:51:56 2014
@@ -1,7 +1,7 @@
 
 <!DOCTYPE html>
 <!--
- Generated by Apache Maven Doxia at 2014-04-20
+ Generated by Apache Maven Doxia at 2014-04-21
  Rendered using Reflow Maven Skin 1.1.0 (http://andriusvelykis.github.io/reflow-maven-skin)
 -->
 <html  xml:lang="en" lang="en">
@@ -143,22 +143,39 @@ STORE A into 'hbase://CORE.ENTITY_HISTOR
 <div class="section"> 
  <h2 id="Pig_Loader">Pig Loader</h2> 
  <p>A Pig data loader allows users to read data from Phoenix backed HBase tables within a Pig script. </p> 
- <p>The Load func provides two alternative ways to load data. 1. Given a Table Name A = load ‘<a class="externalLink" href="hbase://table/HIRES">hbase://table/HIRES</a>’ using org.apache.phoenix.pig.PhoenixHBaseLoader(‘localhost’);</p> 
+ <p>The Load func provides two alternative ways to load data.</p> 
  <div class="source"> 
-  <pre>The above loads the data for all the columns in HIRES table.
-To restrict the list of columns , you can specify the column names as part of LOAD as below
-   A = load 'hbase://table/HIRES/ID,NAME'  using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
- Here, only data for ID and NAME columns are returned.
+  <pre>  1. Given a Table Name
+    A = load 'hbase://table/HIRES'  using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+       The above loads data for all the columns in the HIRES table.
+
+    To restrict the list of columns, specify the column names as part of the LOAD location, as below:
+    A = load 'hbase://table/HIRES/ID,NAME'  using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+       Here, only data for the ID and NAME columns is returned.
+
+  2. Given a Query   
+    A = load 'hbase://query/SELECT ID,NAME FROM HIRES WHERE AGE &gt; 50' using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+       The above query loads all rows whose AGE column has a value &gt; 50. The LOAD func merely executes the given SQL query and returns the results.
+
+    Although a query can be provided as part of LOAD, it is subject to the following restrictions:
+    a) It should be a SELECT query only.
+    b) It shouldn't contain any GROUP BY, ORDER BY, LIMIT, or DISTINCT clauses.
+    c) It shouldn't contain any aggregate functions.
 </pre> 
  </div> 
- <ol style="list-style-type: decimal"> 
-  <li>Given a Query A = load ‘<a class="externalLink" href="hbase://query/SELECT">hbase://query/SELECT</a> ID,NAME FROM HIRES WHERE AGE &gt; 50’ using org.apache.phoenix.pig.PhoenixHBaseLoader(‘localhost’); The above query loads data of all those rows whose AGE column has a value &gt; 50 . The LOAD func merely executes the given SQL query and returns the results. Though there is a provision to provide a query as part of LOAD, it is more restrictive to the following a) Should be a SELECT query only. b) Shouldn’t contain any GROUP BY , ORDER BY , LIMIT , DISTINCT clauses within the query. c) Shouldn’t contain any of AGGREGATE functions.</li> 
- </ol> 
  <p>In both the cases, the zookeeper quorum should be passed to the PhoenixHBaseLoader as an argument to the constructor. </p> 
  <p>The Loadfunc makes best effort to map Phoenix Data Types to Pig datatype. You can have a look at org.apache.phoenix.pig.util.TypeUtil to see how each of Phoenix data type is mapped to Pig data type.</p> 
- <p>TODOS: With Phoenix 3.0 , we provide support for a ARRAY data type. However , this is not yet supported within Pig Loader. Usage of String, Date functions within the provided SQL Query.</p> 
- <p>Example Usage: Goal : Determine the number of users by a CLIENT ID. Ddl: CREATE TABLE HIRES( CLIENTID INTEGER NOT NULL, EMPID INTEGER NOT NULL, NAME VARCHAR CONSTRAINT pk PRIMARY KEY(CLIENTID,EMPID)); Pig Script:</p> 
- <p>raw = LOAD ‘<a class="externalLink" href="hbase://table/HIRES">hbase://table/HIRES</a> USING org.apache.phoenix.pig.PhoenixHBaseLoader(‘localhost’)’; grpd = GROUP raw BY CLIENTID; cnt = FOREACH grpd GENERATE group AS CLIENT,COUNT(raw); DUMP cnt; </p> 
+ <p>####TODO 1. Support for ARRAY data type. 2. Usage of String, Date functions within the provided SQL Query.</p> 
+ <p>####Example : <b>Goal:</b> Determine the number of users for each CLIENTID.</p> 
+ <p><b>Ddl:</b> CREATE TABLE HIRES( CLIENTID INTEGER NOT NULL, EMPID INTEGER NOT NULL, NAME VARCHAR CONSTRAINT pk PRIMARY KEY(CLIENTID,EMPID));</p> 
+ <p><b>Pig Script:</b> </p> 
+ <div class="source"> 
+  <pre>    raw = LOAD 'hbase://table/HIRES' USING org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+    grpd = GROUP raw BY CLIENTID; 
+    cnt = FOREACH grpd GENERATE group AS CLIENT,COUNT(raw);
+    DUMP cnt;  
+</pre> 
+ </div> 
 </div>
 			</div>
 		</div>

Modified: incubator/phoenix/site/source/src/site/markdown/pig_integration.md
URL: http://svn.apache.org/viewvc/incubator/phoenix/site/source/src/site/markdown/pig_integration.md?rev=1588939&r1=1588938&r2=1588939&view=diff
==============================================================================
--- incubator/phoenix/site/source/src/site/markdown/pig_integration.md (original)
+++ incubator/phoenix/site/source/src/site/markdown/pig_integration.md Mon Apr 21 17:51:56 2014
@@ -24,36 +24,40 @@ For example, let’s assume we are wr
 A Pig data loader allows users to read data from Phoenix backed HBase tables within a Pig script. 
 
 The Load func provides two alternative ways to load data.
- 1. Given a Table Name
-      A = load 'hbase://table/HIRES'  using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
-	  
-	The above loads the data for all the columns in HIRES table.
-    To restrict the list of columns , you can specify the column names as part of LOAD as below
-       A = load 'hbase://table/HIRES/ID,NAME'  using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
-     Here, only data for ID and NAME columns are returned.
+      
+	  1. Given a Table Name
+        A = load 'hbase://table/HIRES'  using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+			The above loads data for all the columns in the HIRES table.
+		
+		To restrict the list of columns, specify the column names as part of the LOAD location, as below:
+		A = load 'hbase://table/HIRES/ID,NAME'  using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+			Here, only data for the ID and NAME columns is returned.
 
- 2. Given a Query	 
-      A = load 'hbase://query/SELECT ID,NAME FROM HIRES WHERE AGE > 50' using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
-	The above query loads data of all those rows whose AGE column has a value > 50 . The LOAD func merely executes the given SQL query and returns the results. 
-	Though there is a provision to provide a query as part of LOAD, it is more restrictive to the following
-	a) Should be a SELECT query only.
-	b) Shouldn't contain any GROUP BY , ORDER BY , LIMIT , DISTINCT clauses within the query.
-	c) Shouldn't contain any of AGGREGATE functions.
+      2. Given a Query	 
+		A = load 'hbase://query/SELECT ID,NAME FROM HIRES WHERE AGE > 50' using org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+			The above query loads all rows whose AGE column has a value > 50. The LOAD func merely executes the given SQL query and returns the results.
+		
+		Although a query can be provided as part of LOAD, it is subject to the following restrictions:
+		a) It should be a SELECT query only.
+		b) It shouldn't contain any GROUP BY, ORDER BY, LIMIT, or DISTINCT clauses.
+		c) It shouldn't contain any aggregate functions.
 	
   In both the cases, the zookeeper quorum should be passed to the PhoenixHBaseLoader as an argument to the constructor.	
   
   The Loadfunc makes best effort to map Phoenix Data Types to Pig datatype. You can have a look at org.apache.phoenix.pig.util.TypeUtil to see how each of Phoenix data type is mapped to Pig data type.
   
-  TODOS:
-     With Phoenix 3.0 , we provide support for a ARRAY data type. However , this is not yet supported within Pig Loader.
-     Usage of String, Date functions within the provided SQL Query.
+  ####TODO
+     1. Support for ARRAY data type. 
+     2. Usage of String, Date functions within the provided SQL Query.
 	 
-  Example Usage:
-   Goal : Determine the number of users by a CLIENT ID.
-   Ddl: CREATE TABLE HIRES( CLIENTID INTEGER NOT NULL, EMPID INTEGER NOT NULL, NAME VARCHAR CONSTRAINT pk PRIMARY KEY(CLIENTID,EMPID));
-   Pig Script:
-   
-   raw = LOAD 'hbase://table/HIRES USING org.apache.phoenix.pig.PhoenixHBaseLoader('localhost')';
-   grpd = GROUP raw BY CLIENTID; 
-   cnt = FOREACH grpd GENERATE group AS CLIENT,COUNT(raw);
-   DUMP cnt;  
\ No newline at end of file
+  ####Example :
+  **Goal:** Determine the number of users for each CLIENTID.
+  
+  **Ddl:** CREATE TABLE HIRES( CLIENTID INTEGER NOT NULL, EMPID INTEGER NOT NULL, NAME VARCHAR CONSTRAINT pk PRIMARY KEY(CLIENTID,EMPID));
+  
+  **Pig Script:** 
+  
+		raw = LOAD 'hbase://table/HIRES' USING org.apache.phoenix.pig.PhoenixHBaseLoader('localhost');
+		grpd = GROUP raw BY CLIENTID; 
+		cnt = FOREACH grpd GENERATE group AS CLIENT,COUNT(raw);
+		DUMP cnt;  
\ No newline at end of file