Posted to commits@pig.apache.org by ol...@apache.org on 2010/12/16 19:10:59 UTC

svn commit: r1050082 [4/6] - in /pig/trunk: ./ src/docs/src/documentation/content/xdocs/

Added: pig/trunk/src/docs/src/documentation/content/xdocs/func.xml
URL: http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/func.xml?rev=1050082&view=auto
==============================================================================
--- pig/trunk/src/docs/src/documentation/content/xdocs/func.xml (added)
+++ pig/trunk/src/docs/src/documentation/content/xdocs/func.xml Thu Dec 16 18:10:59 2010
@@ -0,0 +1,3193 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
+
+<document>
+  <header>
+    <title>Built In Functions</title>
+  </header>
+  <body>
+  
+<section>
+<title>About Built In Functions</title>
+<p>
+Pig comes with a set of built in functions (the eval, load/store, math, string, bag and tuple functions). Two main properties differentiate built in functions from <a href="udf.html">user defined  functions</a> (UDFs). First, built in functions don't need to be registered because Pig knows where they are. Second, built in functions don't need to be qualified when they are used because Pig knows where to find them. 
+</p>	
+</section>
+
+<!-- ================================================================== -->
+<!-- DYNAMIC INVOKERS -->
+<section>
+<title>About Dynamic Invokers</title>
+
+<p>Often you may need to use a simple function that is already provided by standard Java libraries, but for which a <a href="udf.html">user defined function</a> (UDF) has not been written. Dynamic invokers allow you to refer to Java functions without having to wrap them in custom UDFs, at the cost of doing some Java reflection on every function call. 
+</p>
+
+<source>
+...
+DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String'); 
+encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray); 
+decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8'); 
+...
+</source>
+
+<p>Currently, dynamic invokers can be used for any static function that: </p>
+<ul>
+<li>Accepts no arguments or accepts some combination of strings, ints, longs, doubles, floats, or arrays with these same types </li>
+<li>Returns a string, an int, a long, a double, or a float</li>
+</ul>
+<p>Only primitives can be used for numbers; no capital-letter numeric classes can be used as arguments. Depending on the return type, a specific kind of invoker must be used: InvokeForString, InvokeForInt, InvokeForLong, InvokeForDouble, or InvokeForFloat. </p>
+
+<p>The <a href="basic.html#define">DEFINE</a> statement is used to bind a keyword to a Java method, as above. The first argument to the InvokeFor* constructor is the full path to the desired method. The second argument is a space-delimited ordered list of the classes of the method arguments. This can be omitted or an empty string if the method takes no arguments. Valid class names are string, long, float, double, and int. Invokers can also work with array arguments, represented in Pig as DataBags of single-tuple elements. Simply refer to string[], for example. Class names are not case sensitive. </p>
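+
+<p>As a minimal sketch of these binding rules, the statements below define one invoker for a method that takes a single double and one for a method that takes no arguments (so the second constructor argument is omitted). The aliases Sqrt and NowMillis, the relation names, and the file name are hypothetical and chosen only for illustration.</p>
+
+<source>
+-- Sqrt and NowMillis are illustrative aliases, not built in functions
+DEFINE Sqrt InvokeForDouble('java.lang.Math.sqrt', 'double');
+DEFINE NowMillis InvokeForLong('java.lang.System.currentTimeMillis');
+
+-- 'numbers.txt' is a hypothetical input file with one double per line
+numbers = LOAD 'numbers.txt' AS (x:double);
+roots = FOREACH numbers GENERATE x, Sqrt(x), NowMillis();
+</source>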
+
+<p>The ability to use invokers on methods that take array arguments makes methods like those in org.apache.commons.math.stat.StatUtils available (for processing the results of grouping your datasets, for example). This is helpful, but a word of caution: the resulting UDF will not be optimized for Hadoop, and the very significant benefits one gains from implementing the Algebraic and Accumulative interfaces are lost here. Be careful if you use invokers this way.</p>
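+
+<p>A binding for an array-taking method might look like the sketch below. It assumes the Commons Math jar is available and registered with Pig; the jar path, the alias ArrayMax, the relation names, and the file name are all hypothetical. For this particular computation the built in MAX function is the better choice; the sketch is only meant to show the 'double[]' argument syntax.</p>
+
+<source>
+REGISTER commons-math.jar;  -- hypothetical jar path
+DEFINE ArrayMax InvokeForDouble('org.apache.commons.math.stat.StatUtils.max', 'double[]');
+
+-- 'measurements.txt' is a hypothetical input file
+measurements = LOAD 'measurements.txt' AS (sensor:chararray, reading:double);
+by_sensor = GROUP measurements BY sensor;
+
+-- measurements.reading is a bag of single-field tuples, passed to the method as a double[]
+maxima = FOREACH by_sensor GENERATE group, ArrayMax(measurements.reading);
+</source>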
+</section>
+  
+<!-- ======================================================== -->  
+<!-- EVAL FUNCTIONS -->    
+<section>
+<title>Eval Functions</title>
+
+<section>
+<title>AVG</title>
+   <p>Computes the average of the numeric values in a single-column bag. </p>
+   <section>
+   <title>Syntax</title>
+   <table> 
+      <tr>
+            <td>
+               <p>AVG(expression)</p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>Any expression whose result is a bag. The elements of the bag should be data type int, long, float, or double.</p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the AVG function to compute the average of the numeric values in a single-column bag. 
+   AVG requires a preceding GROUP ALL statement for global averages and a GROUP BY statement for group averages.</p>
+   <p>The AVG function now ignores NULL values. </p>      
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example the average GPA for each student is computed (see the <a href="basic.html#GROUP">GROUP</a> operator for information about the field names in relation B).</p>
+<source>
+A = LOAD 'student.txt' AS (name:chararray, term:chararray, gpa:float);
+
+DUMP A;
+(John,fl,3.9F)
+(John,wt,3.7F)
+(John,sp,4.0F)
+(John,sm,3.8F)
+(Mary,fl,3.8F)
+(Mary,wt,3.9F)
+(Mary,sp,4.0F)
+(Mary,sm,4.0F)
+
+B = GROUP A BY name;
+
+DUMP B;
+(John,{(John,fl,3.9F),(John,wt,3.7F),(John,sp,4.0F),(John,sm,3.8F)})
+(Mary,{(Mary,fl,3.8F),(Mary,wt,3.9F),(Mary,sp,4.0F),(Mary,sm,4.0F)})
+
+C = FOREACH B GENERATE A.name, AVG(A.gpa);
+
+DUMP C;
+({(John),(John),(John),(John)},3.850000023841858)
+({(Mary),(Mary),(Mary),(Mary)},3.925000011920929)
+</source>
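+   <p>A global average over the same relation A can be sketched with GROUP ALL in place of GROUP BY (the relation names D and E are chosen here for illustration):</p>
+<source>
+-- put every tuple of A into a single group, then average over that one bag
+D = GROUP A ALL;
+E = FOREACH D GENERATE AVG(A.gpa);
+</source>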
+   </section>
+   
+   <section>
+   <title>Types Tables</title>
+   <table>
+         <tr>
+            <td>
+               <p></p>
+            </td>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>bytearray </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>AVG </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>error </p>
+            </td>
+            <td>
+               <p>cast as double </p>
+            </td>
+         </tr> 
+   </table>
+   </section></section>
+   
+   <section>
+   <title>CONCAT</title>
+   <p>Concatenates two expressions of identical type.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>CONCAT (expression, expression)</p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>Any expression.</p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the CONCAT function to concatenate two expressions. The result values of the two expressions must have identical types.</p>
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example fields f2 and f3 are concatenated.</p>
+<source>
+A = LOAD 'data' as (f1:chararray, f2:chararray, f3:chararray);
+
+DUMP A;
+(apache,open,source)
+(hadoop,map,reduce)
+(pig,pig,latin)
+
+X = FOREACH A GENERATE CONCAT(f2,f3);
+
+DUMP X;
+(opensource)
+(mapreduce)
+(piglatin)
+</source>
+   </section></section>
+   
+   <section >
+   <title>COUNT</title>
+   <p>Computes the number of elements in a bag. </p>
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>COUNT(expression) </p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with data type bag.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the COUNT function to compute the number of elements in a bag.
+   COUNT requires a preceding GROUP ALL statement for global counts and a GROUP BY statement for group counts.</p>
+
+   <p>
+    The COUNT function ignores nulls. 
+    Specifically, a tuple in the bag is not counted if the first field in that tuple is NULL. 
+    To include NULL values in the count computation, use 
+    <a href="#COUNT_STAR">COUNT_STAR</a>.
+   </p>   
+   
+   <p>
+    Note: You cannot use the tuple designator (*) with COUNT; that is, COUNT(*) will not work.   
+   </p>
+   </section>
+   
+   
+   <section>
+   <title>Example</title>
+   <p>In this example the tuples in the bag are counted (see the <a href="basic.html#GROUP">GROUP</a> operator for information about the field names in relation B).</p>
+<source>
+A = LOAD 'data' AS (f1:int,f2:int,f3:int);
+
+DUMP A;
+(1,2,3)
+(4,2,1)
+(8,3,4)
+(4,3,3)
+(7,2,5)
+(8,4,3)
+
+B = GROUP A BY f1;
+
+DUMP B;
+(1,{(1,2,3)})
+(4,{(4,2,1),(4,3,3)})
+(7,{(7,2,5)})
+(8,{(8,3,4),(8,4,3)})
+
+X = FOREACH B GENERATE COUNT(A);
+
+DUMP X;
+(1L)
+(2L)
+(1L)
+(2L)
+</source>
+   </section>
+   
+   <section>
+   <title>Types Tables</title>
+   <table>
+         <tr>
+            <td>
+               <p></p>
+            </td>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>bytearray </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>COUNT </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+         </tr> 
+   </table>
+   </section></section>
+   
+ <section>
+   <title>COUNT_STAR</title>
+   <p>Computes the number of elements in a bag. </p>
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>COUNT_STAR(expression)  </p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with data type bag.</p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the COUNT_STAR function to compute the number of elements in a bag.
+   COUNT_STAR requires a preceding GROUP ALL statement for global counts and a GROUP BY statement for group counts.</p>
+   <p>COUNT_STAR includes NULL values in the count computation 
+   (unlike <a href="#COUNT">COUNT</a>, which ignores NULL values).
+   </p>
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example COUNT_STAR is used to count the tuples in a bag.</p>
+<source>
+X = FOREACH B GENERATE COUNT_STAR(A);
+</source>
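+   <p>To see how COUNT_STAR differs from COUNT, both can be applied to the same bag. The sketch below assumes a hypothetical input file in which some rows have a null first field; those rows would be skipped by COUNT but included by COUNT_STAR.</p>
+<source>
+-- 'data' is a hypothetical input file; some rows are assumed to have a null f1
+A = LOAD 'data' AS (f1:int,f2:int,f3:int);
+B = GROUP A ALL;
+X = FOREACH B GENERATE COUNT(A), COUNT_STAR(A);
+</source>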
+   </section>
+    </section>
+   
+   <section>
+   <title>DIFF</title>
+   <p>Compares two fields in a tuple.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>DIFF (expression, expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with any data type.</p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Usage</title>
+   <p>The DIFF function takes two bags as arguments and compares them. 
+   Any tuples that are in one bag but not the other are returned in a bag. 
+   If the bags match, an empty bag is returned. If the fields are not bags 
+   then they will be wrapped in tuples and returned in a bag if they do not match, 
+   or an empty bag will be returned if the two records match. The implementation 
+   assumes that both bags being passed to the DIFF function will fit entirely 
+   into memory simultaneously. If this is not the case the UDF will still function 
+   but it will be VERY slow.</p>
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example DIFF compares the tuples in two bags.</p>
+<source>
+A = LOAD 'bag_data' AS (B1:bag{T1:tuple(t1:int,t2:int)},B2:bag{T2:tuple(f1:int,f2:int)});
+
+DUMP A;
+({(8,9),(0,1)},{(8,9),(1,1)})
+({(2,3),(4,5)},{(2,3),(4,5)})
+({(6,7),(3,7)},{(2,2),(3,7)})
+
+DESCRIBE A;
+A: {B1: {T1: (t1: int,t2: int)},B2: {T2: (f1: int,f2: int)}}
+
+X = FOREACH A GENERATE DIFF(B1,B2);
+
+DUMP X;
+({(0,1),(1,1)})
+({})
+({(6,7),(2,2)})
+</source>
+   </section></section>
+   
+<section>
+   <title>IsEmpty</title>
+   <p>Checks if a bag or map is empty.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>IsEmpty(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with any data type.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>The IsEmpty function checks if a bag or map is empty (has no data). The function can be used to filter data.</p></section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example all students with an SSN but no name are located.</p>
+<source>
+SSN = load 'ssn.txt' using PigStorage() as (ssn:long);
+
+SSN_NAME = load 'students.txt' using PigStorage() as (ssn:long, name:chararray);
+
+-- do a left outer join of SSN with SSN_NAME
+X = cogroup SSN by ssn inner, SSN_NAME by ssn;
+
+-- only keep those ssn's for which there is no name
+Y = filter X by IsEmpty(SSN_NAME);
+</source>
+   </section></section>    
+   
+   
+   <section>
+   <title>MAX</title>
+   <p>Computes the maximum of the numeric values or chararrays in a single-column bag. MAX requires a preceding GROUP ALL statement for global maximums and a GROUP BY statement for group maximums.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>MAX(expression)        </p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with data types int, long, float, double, or chararray.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the MAX function to compute the maximum of the numeric values or chararrays in a single-column bag.</p></section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example the maximum GPA for all terms is computed for each student (see the GROUP operator for information about the field names in relation B).</p>
+<source>
+A = LOAD 'student' AS (name:chararray, session:chararray, gpa:float);
+
+DUMP A;
+(John,fl,3.9F)
+(John,wt,3.7F)
+(John,sp,4.0F)
+(John,sm,3.8F)
+(Mary,fl,3.8F)
+(Mary,wt,3.9F)
+(Mary,sp,4.0F)
+(Mary,sm,4.0F)
+
+B = GROUP A BY name;
+
+DUMP B;
+(John,{(John,fl,3.9F),(John,wt,3.7F),(John,sp,4.0F),(John,sm,3.8F)})
+(Mary,{(Mary,fl,3.8F),(Mary,wt,3.9F),(Mary,sp,4.0F),(Mary,sm,4.0F)})
+
+X = FOREACH B GENERATE group, MAX(A.gpa);
+
+DUMP X;
+(John,4.0F)
+(Mary,4.0F)
+</source>
+   </section>
+   
+   <section>
+   <title>Types Tables</title>
+   <table>
+         <tr>
+            <td>
+               <p></p>
+            </td>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>bytearray </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>MAX </p>
+            </td>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>cast as double</p>
+            </td>
+         </tr> 
+   </table>
+   </section></section>
+   
+   <section>
+   <title>MIN</title>
+   <p>Computes the minimum of the numeric values or chararrays in a single-column bag. MIN requires a preceding GROUP ALL statement for global minimums and a GROUP BY statement for group minimums.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>MIN(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with data types int, long, float, double, or chararray.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   
+   <title>Usage</title>
+   <p>Use the MIN function to compute the minimum of a set of numeric values or chararrays in a single-column bag.</p></section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example the minimum GPA for all terms is computed for each student (see the GROUP operator for information about the field names in relation B).</p>
+<source>
+A = LOAD 'student' AS (name:chararray, session:chararray, gpa:float);
+
+DUMP A;
+(John,fl,3.9F)
+(John,wt,3.7F)
+(John,sp,4.0F)
+(John,sm,3.8F)
+(Mary,fl,3.8F)
+(Mary,wt,3.9F)
+(Mary,sp,4.0F)
+(Mary,sm,4.0F)
+
+B = GROUP A BY name;
+
+DUMP B;
+(John,{(John,fl,3.9F),(John,wt,3.7F),(John,sp,4.0F),(John,sm,3.8F)})
+(Mary,{(Mary,fl,3.8F),(Mary,wt,3.9F),(Mary,sp,4.0F),(Mary,sm,4.0F)})
+
+X = FOREACH B GENERATE group, MIN(A.gpa);
+
+DUMP X;
+(John,3.7F)
+(Mary,3.8F)
+</source>
+   </section>
+   
+   <section>
+   <title>Types Tables</title>
+   <table>
+         <tr>
+            <td>
+               <p></p>
+            </td>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>bytearray </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>MIN </p>
+            </td>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>cast as double</p>
+            </td>
+         </tr> 
+   </table>
+   </section></section>
+   
+   <section>
+   <title>SIZE</title>
+   <p>Computes the number of elements based on any Pig data type. </p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>SIZE(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with any data type.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the SIZE function to compute the number of elements based on the data type (see the Types Tables below). 
+   SIZE includes NULL values in the size computation. SIZE is not algebraic.</p>
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example the number of characters in the first field is computed.</p>
+<source>
+A = LOAD 'data' as (f1:chararray, f2:chararray, f3:chararray);
+
+DUMP A;
+(apache,open,source)
+(hadoop,map,reduce)
+(pig,pig,latin)
+
+X = FOREACH A GENERATE SIZE(f1);
+
+DUMP X;
+(6L)
+(6L)
+(3L)
+</source>
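+   <p>SIZE can also be applied to complex types. As a sketch (the relation names B and Y are illustrative), grouping the same relation A and taking the size of the resulting bag gives the number of tuples in the bag, which is 3 for the data shown above:</p>
+<source>
+-- collect all tuples of A into one bag, then count the tuples in that bag
+B = GROUP A ALL;
+Y = FOREACH B GENERATE SIZE(A);
+</source>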
+   </section>
+   
+   <section>
+   <title>Types Tables</title>
+   <table>
+       <tr>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>returns 1 </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>returns 1 </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>returns 1 </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>returns 1 </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>returns number of characters in the array </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>bytearray </p>
+            </td>
+            <td>
+               <p>returns number of bytes in the array </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>tuple </p>
+            </td>
+            <td>
+               <p>returns number of fields in the tuple</p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>bag </p>
+            </td>
+            <td>
+               <p>returns number of tuples in bag </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>map </p>
+            </td>
+            <td>
+               <p>returns number of key/value pairs in map </p>
+            </td>
+         </tr> 
+   </table></section></section>
+   
+   <section>
+   <title>SUM</title>
+   <p>Computes the sum of the numeric values in a single-column bag. SUM requires a preceding GROUP ALL statement for global sums and a GROUP BY statement for group sums.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>SUM(expression)        </p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with data types int, long, float, double, or bytearray cast as double.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the SUM function to compute the sum of a set of numeric values in a single-column bag.</p></section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example the number of pets for each owner is computed (see the GROUP operator for information about the field names in relation B).</p>
+<source>
+A = LOAD 'data' AS (owner:chararray, pet_type:chararray, pet_num:int);
+
+DUMP A;
+(Alice,turtle,1)
+(Alice,goldfish,5)
+(Alice,cat,2)
+(Bob,dog,2)
+(Bob,cat,2) 
+
+B = GROUP A BY owner;
+
+DUMP B;
+(Alice,{(Alice,turtle,1),(Alice,goldfish,5),(Alice,cat,2)})
+(Bob,{(Bob,dog,2),(Bob,cat,2)})
+
+X = FOREACH B GENERATE group, SUM(A.pet_num);
+DUMP X;
+(Alice,8L)
+(Bob,4L)
+</source>
+   </section>
+   
+   <section>
+   <title>Types Tables</title>
+   <table>
+         <tr>
+            <td>
+               <p></p>
+            </td>
+            <td>
+               <p>int </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>float </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>chararray </p>
+            </td>
+            <td>
+               <p>bytearray </p>
+            </td>
+         </tr>
+         <tr>
+            <td>
+               <p>SUM </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>long </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>double </p>
+            </td>
+            <td>
+               <p>error </p>
+            </td>
+            <td>
+               <p>cast as double </p>
+            </td>
+         </tr> 
+   </table>
+   </section></section>
+   
+   <section>
+   <title>TOKENIZE</title>
+   <p>Splits a string and outputs a bag of words. </p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TOKENIZE(expression)        </p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with data type chararray.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>Use the TOKENIZE function to split a string of words (all words in a single tuple) into a bag of words (each word in a single tuple). The following characters are considered to be word separators: space, double quote ("), comma (,), parentheses (()), and star (*).</p>
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example the strings in each row are split.</p>
+<source>
+A  = LOAD 'data' AS (f1:chararray);
+
+DUMP A;
+(Here is the first string.)
+(Here is the second string.)
+(Here is the third string.)
+
+X = FOREACH A GENERATE TOKENIZE(f1);
+
+DUMP X;
+({(Here),(is),(the),(first),(string.)})
+({(Here),(is),(the),(second),(string.)})
+({(Here),(is),(the),(third),(string.)})
+</source>
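+   <p>TOKENIZE is often combined with FLATTEN, which un-nests the bag so that each word becomes its own tuple. A minimal word-count sketch over the same relation A follows (the names W, G, C, and word are illustrative):</p>
+<source>
+-- one tuple per word
+W = FOREACH A GENERATE FLATTEN(TOKENIZE(f1)) AS word;
+
+-- group identical words and count the occurrences of each
+G = GROUP W BY word;
+C = FOREACH G GENERATE group, COUNT(W);
+</source>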
+   
+   </section></section></section>
+   
+   <section>
+   <title>Load/Store Functions</title>
+   <p>Load/store functions determine how data goes into Pig and comes out of Pig. 
+   Pig provides a set of built-in load/store functions, described in the sections below. 
+   You can also write your own load/store functions  (see <a href="udf.html">User Defined Functions</a>).</p>
+   
+
+   <section>
+   <title>Handling Compression</title>
+
+<p>Support for compression is determined by the load/store function. PigStorage and TextLoader support gzip and bzip compression for both read (load) and write (store). BinStorage does not support compression.</p>
+
+<p>To work with gzip compressed files, input/output files need to have a .gz extension. Gzipped files cannot be split across multiple maps; this means that the number of maps created is equal to the number of part files in the input location.</p>
+
+<source>
+A = load 'myinput.gz';
+store A into 'myoutput.gz'; 
+</source>
+
+<p>To work with bzip compressed files, the input/output files need to have a .bz or .bz2 extension. Because the compression is block-oriented, bzipped files can be split across multiple maps.</p>
+
+<source>
+A = load 'myinput.bz';
+store A into 'myoutput.bz'; 
+</source>
+
+<p>Note: PigStorage and TextLoader correctly read compressed files as long as they are NOT CONCATENATED FILES generated in this manner: </p>
+  <ul>
+      <li>
+         <p>cat *.gz > text/concat.gz</p>
+      </li>
+      <li>
+         <p>cat *.bz > text/concat.bz </p>
+      </li>
+      <li>
+         <p>cat *.bz2 > text/concat.bz2</p>
+      </li>
+   </ul>
+<p></p>
+<p>If you use concatenated gzip or bzip files with your Pig jobs, you will NOT see a failure but the results will be INCORRECT.</p>
+<p></p>
+
+</section>
+
+   <section>
+   <title>BinStorage</title>
+   <p>Loads and stores data in machine-readable format.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>BinStorage()        </p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>none</p>
+            </td>
+            <td>
+               <p>no parameters</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>BinStorage works with data that is represented on disk in machine-readable format. 
+   BinStorage does NOT support <a href="#Handling+Compression">compression</a>.</p>
+   
+    <p>BinStorage is used internally by Pig to store the temporary data that is created between multiple map/reduce jobs.</p>
+      
+    <p>BinStorage supports multiple locations (files, directories, globs) as input.</p>
+    
+      </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example BinStorage is used with the LOAD and STORE operators.</p>
+<source>
+A = LOAD 'data' USING BinStorage();
+
+STORE X into 'output' USING BinStorage(); 
+</source>
+
+   <p>In this example BinStorage is used to load multiple locations.</p>
+<source>
+A = LOAD 'input1.bin, input2.bin' USING BinStorage();
+</source>
+
+<p>BinStorage does not track data lineage. When Pig uses BinStorage to move data between MapReduce jobs, Pig can figure out the correct cast function to use and apply it. However, as shown in the example below, when you store data using BinStorage and then use a separate Pig Latin script to read the data (thus losing the type information), it is your responsibility to correctly cast the data before storing it using BinStorage.
+ </p>
+
+<source>
+raw = load 'sampledata' using BinStorage() as (col1,col2, col3);
+--filter out null columns
+A = filter raw by col1#'bcookie' is not null;
+
+B = foreach A generate col1#'bcookie'  as reqcolumn;
+describe B;
+--B: {reqcolumn: bytearray}
+X = limit B 5;
+dump X;
+(36co9b55onr8s)
+(36co9b55onr8s)
+(36hilul5oo1q1)
+(36hilul5oo1q1)
+(36l4cj15ooa8a)
+
+B = foreach A generate (chararray)col1#'bcookie'  as convertedcol;
+describe B;
+--B: {convertedcol: chararray}
+X = limit B 5;
+dump X; 
+()
+()
+()
+()
+()
+</source>
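+
+<p>The following is a minimal sketch of casting before storing. It assumes a hypothetical text file named 'rawdata' whose single field is a map; the file names and schema are illustrative only. Because the cast is applied before the data is written with BinStorage, the intent is that a later script does not have to cast the value itself.</p>
+<source>
+-- hypothetical input: one map field per record
+raw = LOAD 'rawdata' AS (col1: map[]);
+
+-- cast while the loader's cast functions are still available
+B = FOREACH raw GENERATE (chararray)col1#'bcookie' AS convertedcol;
+
+STORE B INTO 'sampledata' USING BinStorage();
+</source>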
+
+   </section></section>
+   
+   <section>
+   <title>PigStorage</title>
+   <p>Loads and stores data in UTF-8 format.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>PigStorage(field_delimiter) </p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>field_delimiter</p>
+            </td>
+            <td>
+               <p>Parameter. </p>
+               <p>The default field delimiter is tab ('\t'). </p>
+               <p>You can specify other characters as field delimiters; however, be sure to encase the characters in single quotes.</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>PigStorage is the default function for the LOAD and STORE operators and works with both simple and complex data types. </p>
+   
+   <p>PigStorage supports structured text files (in human-readable UTF-8 format). PigStorage also supports <a href="#Handling+Compression">compression</a>.</p>
+   
+    <p>PigStorage supports multiple locations (files, directories, globs) as input.</p>
+
+  <p>Load statements – PigStorage expects data to be formatted using field delimiters, either the tab character  ('\t') or other specified character.</p>
+
+   <p>Store statements – PigStorage outputs data using field delimiters, either the tab character ('\t') or other specified character, and the line feed record delimiter ('\n').</p>
+
+   <p>Field Delimiters – For load and store statements the default field delimiter is the tab character ('\t'). You can use other characters as field delimiters, but separators such as ^A or Ctrl-A should be represented in Unicode (\u0001) using UTF-16 encoding (see Wikipedia <a href="http://en.wikipedia.org/wiki/ASCII">ASCII</a>, <a href="http://en.wikipedia.org/wiki/Unicode">Unicode</a>, and <a href="http://en.wikipedia.org/wiki/UTF-16">UTF-16</a>).</p>
+   
+   <p>Record Delimiters – For load statements Pig interprets the line feed ('\n'), carriage return ('\r' or CTRL-M), and combined CR + LF ('\r\n') characters as record delimiters (do not use these characters as field delimiters). For store statements Pig uses the line feed ('\n') character as the record delimiter.</p>
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example PigStorage expects the student file to contain tab-separated fields and newline-separated records. The two statements are equivalent.</p>
+<source>
+A = LOAD 'student' USING PigStorage('\t') AS (name: chararray, age:int, gpa: float); 
+
+A = LOAD 'student' AS (name: chararray, age:int, gpa: float);
+</source>
+   
+   <p>In this example PigStorage stores the contents of X in files whose fields are delimited with an asterisk ( * ). The STORE operator writes the files to a directory named output; the files are named part-nnnnn (for example, part-00000).</p>
+<source>
+STORE X INTO  'output' USING PigStorage('*');
+</source>
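+
+   <p>As a sketch of the field delimiter rules described in the usage notes, this example assumes a hypothetical file named 'logs' whose fields are separated by Ctrl-A. The delimiter is written in its Unicode form, and the result is stored again with a comma as the field delimiter. The file names and schema are illustrative only.</p>
+<source>
+-- Ctrl-A (^A) written as a Unicode escape
+A = LOAD 'logs' USING PigStorage('\u0001') AS (user: chararray, url: chararray);
+
+-- write the same records with a comma as the field delimiter
+STORE A INTO 'logs_csv' USING PigStorage(',');
+</source>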
+   </section></section>
+   
+   <section>
+   <title>PigDump</title>
+   <p>Stores data in UTF-8 format.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>PigDump()        </p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>none</p>
+            </td>
+            <td>
+               <p>no parameters</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>PigDump stores data as tuples in human-readable UTF-8 format. </p></section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example PigDump is used with the STORE operator.</p>
+<source>
+STORE X INTO 'output' USING PigDump();
+</source>
+   </section></section>
+   
+   <section>
+   <title>TextLoader</title>
+   <p>Loads unstructured data in UTF-8 format.</p>
+   
+   <section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TextLoader()</p>
+            </td>
+         </tr> 
+   </table>
+   </section>
+   
+   <section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>none</p>
+            </td>
+            <td>
+               <p>no parameters</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+   <section>
+   <title>Usage</title>
+   <p>TextLoader works with unstructured data in UTF-8 format. Each resulting tuple contains a single field with one line of input text. TextLoader also supports <a href="#Handling+Compression">compression</a>.</p>
+   <p>Currently, TextLoader support for compression is limited.</p>  
+   <p>TextLoader cannot be used to store data.</p>
+   </section>
+   
+   <section>
+   <title>Example</title>
+   <p>In this example TextLoader is used with the LOAD operator.</p>
+<source>
+A = LOAD 'data' USING TextLoader();
+</source>
+   </section></section></section>
+   
+
+<!-- ======================================================== -->  
+<!-- ======================================================== -->  
+<!-- Math Functions -->
+<section>
+<title>Math Functions</title>
+
+<p>For general information about these functions, see the <a href="http://download.oracle.com/javase/6/docs/api/">Java API Specification</a>, 
+<a href="http://download.oracle.com/javase/6/docs/api/java/lang/Math.html">Class Math</a>. Note the following:</p>
+
+<ul>
+		<li>
+<p>Pig function names are case sensitive and UPPER CASE.</p>
+	</li>
+	<li>
+<p>Pig may process results differently than as stated in the Java API Specification:</p>
+<ul>
+	<li>
+<p>If the result value is null or empty, Pig returns null.</p>
+	</li>
+		<li>
+<p>If the result value is not a number (NaN), Pig returns null.</p>
+	</li>
+		<li>
+<p>If Pig is unable to process the expression, Pig throws an exception.</p>
+	</li>
+</ul> 
+	</li>
+</ul> 
+ 
+<section>
+   <title>ABS</title>
+   <p>Returns the absolute value of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>ABS(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>Any expression whose result is type int, long, float, or double.</p>
+            </td>
+         </tr>
+          
+   </table></section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+	Use the ABS function to return the absolute value of an expression. 
+    If the result is not negative (x &#8805; 0), the result is returned. If the result is negative (x &lt; 0), the negation of the result is returned.
+    </p>
+
+</section>
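+
+<section>
+   <title>Example</title>
+   <p>In this sketch ABS is applied to an int field and a double field. The file name 'data' and the field names are illustrative assumptions.</p>
+<source>
+A = LOAD 'data' AS (n: int, x: double);
+
+-- each value is returned next to its absolute value
+B = FOREACH A GENERATE n, ABS(n), x, ABS(x);
+</source>
+</section>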
+   
+</section>
+
+<!-- ======================================================== --> 
+    
+<section>
+   <title>ACOS</title>
+   <p>Returns the arc cosine of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>ACOS(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the ACOS function to return the arc cosine of an expression.
+  </p>
+   </section>
+   
+</section>    
+   
+  <!-- ======================================================== -->     
+    <section>
+   <title>ASIN</title>
+   <p>Returns the arc sine of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>ASIN(expression)</p>
+            </td>
+         </tr>
+        
+   </table>
+ </section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the ASIN function to return the arc sine of an expression.
+     </p>
+   </section>
+</section>
+   
+  <!-- ======================================================== -->  
+  
+ <section>
+   <title>ATAN</title>
+   <p>Returns the arc tangent of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>ATAN(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the ATAN function to return the arc tangent of an expression.
+     </p>
+   </section>
+   
+</section>  
+
+  <!-- ======================================================== -->  
+  
+ <section>
+   <title>CBRT</title>
+   <p>Returns the cube root of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>CBRT(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the CBRT function to return the cube root of an expression. 
+</p>
+   </section>
+
+</section>  
+
+ <!-- ======================================================== -->  
+  
+ <section>
+   <title>CEIL</title>
+   <p>Returns the value of an expression rounded up to the nearest integer.
+</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>CEIL(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the CEIL function to return the value of an expression rounded up to the nearest integer. 
+This function never decreases the result value.
+     </p>
+        <table>
+       <tr>
+            <td>
+               <p>x</p>
+            </td>
+            <td>
+               <p>CEIL(x)</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p> 4.6</p>
+            </td>
+            <td>
+               <p> 5</p>
+            </td>
+         </tr>
+        
+        <tr>
+            <td>
+               <p> 3.5</p>
+            </td>
+            <td>
+               <p> 4</p>
+            </td>
+         </tr>
+        
+         <tr>
+            <td>
+               <p> 2.4</p>
+            </td>
+            <td>
+               <p> 3</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p>1.0</p>
+            </td>
+            <td>
+               <p>1</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p>-1.0</p>
+            </td>
+            <td>
+               <p>-1</p>
+            </td>
+         </tr>
+        
+                <tr>
+            <td>
+               <p>-2.4</p>
+            </td>
+            <td>
+               <p>-2</p>
+            </td>
+         </tr>
+        
+         <tr>
+            <td>
+               <p>-3.5</p>
+            </td>
+            <td>
+               <p>-3</p>
+            </td>
+         </tr>
+        
+                <tr>
+            <td>
+               <p>-4.6</p>
+            </td>
+            <td>
+               <p>-4</p>
+            </td>
+         </tr>
+        
+   </table>
+
+   </section>
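+
+<section>
+   <title>Example</title>
+   <p>In this sketch CEIL is applied to a field of type double. For the input values listed in the table above, the results are those shown in the second column. The file name 'data' and the field name x are illustrative assumptions.</p>
+<source>
+A = LOAD 'data' AS (x: double);
+
+-- x is rounded up, so the result is never less than x
+B = FOREACH A GENERATE x, CEIL(x);
+</source>
+</section>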
+</section>    
+  
+<!-- ======================================================== -->  
+  
+ <section>
+   <title>COSH</title>
+   <p>Returns the hyperbolic cosine of an expression.
+</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>COSH(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the COSH function to return the hyperbolic cosine of an expression. 
+     </p>
+   </section>
+</section>    
+  
+  <!-- ======================================================== -->  
+  
+ <section>
+   <title>COS</title>
+   <p>Returns the trigonometric cosine of an expression.
+</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>COS(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression (angle) whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the COS function to return the trigonometric cosine of an expression.
+     </p>
+   </section>
+   
+</section>    
+
+<!-- ======================================================== -->  
+  
+ <section>
+   <title>EXP</title>
+   <p>Returns Euler's number e raised to the power of x.
+</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>EXP(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the EXP function to return the value of Euler's number e raised to the power of x (where x is the result value of the expression).
+     </p>
+   </section>
+</section>    
+  
+<!-- ======================================================== -->  
+  
+ <section>
+   <title>FLOOR</title>
+   <p>Returns the value of an expression rounded down to the nearest integer. 
+</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>FLOOR(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the FLOOR function to return the value of an expression rounded down to the nearest integer. 
+This function never increases the result value.
+     </p>
+     
+     
+        <table>
+       <tr>
+            <td>
+               <p>x</p>
+            </td>
+            <td>
+               <p>FLOOR(x)</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p> 4.6</p>
+            </td>
+            <td>
+               <p> 4</p>
+            </td>
+         </tr>
+        
+        <tr>
+            <td>
+               <p> 3.5</p>
+            </td>
+            <td>
+               <p> 3</p>
+            </td>
+         </tr>
+        
+         <tr>
+            <td>
+               <p> 2.4</p>
+            </td>
+            <td>
+               <p> 2</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p>1.0</p>
+            </td>
+            <td>
+               <p>1</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p>-1.0</p>
+            </td>
+            <td>
+               <p>-1</p>
+            </td>
+         </tr>
+        
+                <tr>
+            <td>
+               <p>-2.4</p>
+            </td>
+            <td>
+               <p>-3</p>
+            </td>
+         </tr>
+        
+         <tr>
+            <td>
+               <p>-3.5</p>
+            </td>
+            <td>
+               <p>-4</p>
+            </td>
+         </tr>
+        
+                <tr>
+            <td>
+               <p>-4.6</p>
+            </td>
+            <td>
+               <p>-5</p>
+            </td>
+         </tr>
+        
+   </table>
+   </section>
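+
+<section>
+   <title>Example</title>
+   <p>In this sketch FLOOR is applied to a field of type double. For the input values listed in the table above, the results are those shown in the second column. The file name 'data' and the field name x are illustrative assumptions.</p>
+<source>
+A = LOAD 'data' AS (x: double);
+
+-- x is rounded down, so the result is never greater than x
+B = FOREACH A GENERATE x, FLOOR(x);
+</source>
+</section>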
+</section>      
+<!-- ======================================================== -->  
+  
+ <section>
+   <title>LOG</title>
+   <p>Returns the natural logarithm (base e) of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>LOG(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the LOG function to return the natural logarithm (base e) of an expression.
+     </p>
+   </section>
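+
+<section>
+   <title>Example</title>
+   <p>In this sketch LOG returns the natural logarithm of a double field. The second expression shows one way to derive a base 2 logarithm with the change-of-base identity. The file name 'data' and the field name x are illustrative assumptions.</p>
+<source>
+A = LOAD 'data' AS (x: double);
+
+-- log2(x) = ln(x) / ln(2)
+B = FOREACH A GENERATE x, LOG(x), LOG(x) / LOG(2.0);
+</source>
+</section>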
+</section>     
+  
+  <!-- ======================================================== -->  
+  
+ <section>
+   <title>LOG10</title>
+   <p>Returns the base 10 logarithm of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>LOG10(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the LOG10 function to return the base 10 logarithm of an expression.
+     </p>
+   </section>
+</section>     
+
+  <!-- ======================================================== -->  
+  
+ <section>
+   <title>RANDOM</title>
+   <p>Returns a pseudo random number.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>RANDOM( )</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>N/A</p>
+            </td>
+            <td>
+               <p>No terms.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the RANDOM function to return a pseudo random number (type double) greater than or equal to 0.0 and less than 1.0.
+     </p>  
+   </section>
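+
+<section>
+   <title>Example</title>
+   <p>In this sketch RANDOM is used to keep roughly one record in ten. Because the function is pseudo random, the records that are kept differ from run to run. The file name 'data' and the schema are illustrative assumptions.</p>
+<source>
+A = LOAD 'data' AS (name: chararray, age: int);
+
+-- RANDOM() returns a double in the range 0.0 (inclusive) to 1.0 (exclusive)
+B = FILTER A BY RANDOM() &lt; 0.1;
+</source>
+</section>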
+</section>     
+  
+<!-- ======================================================== -->  
+  
+ <section>
+   <title>ROUND</title>
+   <p>Returns the value of an expression rounded to an integer.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>ROUND(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type float or double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the ROUND function to return the value of an expression rounded to an integer (if the result type is float) or rounded to a long (if the result type is double).
+     </p>
+        <table>
+       <tr>
+            <td>
+               <p>x</p>
+            </td>
+            <td>
+               <p>ROUND(x)</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p> 4.6</p>
+            </td>
+            <td>
+               <p> 5</p>
+            </td>
+         </tr>
+        
+        <tr>
+            <td>
+               <p> 3.5</p>
+            </td>
+            <td>
+               <p> 4</p>
+            </td>
+         </tr>
+        
+         <tr>
+            <td>
+               <p> 2.4</p>
+            </td>
+            <td>
+               <p> 2</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p>1.0</p>
+            </td>
+            <td>
+               <p>1</p>
+            </td>
+         </tr>
+        
+              <tr>
+            <td>
+               <p>-1.0</p>
+            </td>
+            <td>
+               <p>-1</p>
+            </td>
+         </tr>
+        
+                <tr>
+            <td>
+               <p>-2.4</p>
+            </td>
+            <td>
+               <p>-2</p>
+            </td>
+         </tr>
+        
+         <tr>
+            <td>
+               <p>-3.5</p>
+            </td>
+            <td>
+               <p>-3</p>
+            </td>
+         </tr>
+        
+                <tr>
+            <td>
+               <p>-4.6</p>
+            </td>
+            <td>
+               <p>-5</p>
+            </td>
+         </tr>
+        
+   </table>
+   </section>
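+
+<section>
+   <title>Example</title>
+   <p>In this sketch ROUND is applied to a float field and a double field. As described in the usage notes, the first call is expected to return an int and the second a long. The file name 'data' and the field names are illustrative assumptions.</p>
+<source>
+A = LOAD 'data' AS (f: float, d: double);
+
+B = FOREACH A GENERATE ROUND(f), ROUND(d);
+</source>
+</section>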
+</section>       
+  
+<!-- ======================================================== -->  
+ <section>
+   <title>SIN</title>
+   <p>Returns the sine of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>SIN(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the SIN function to return the sine of an expression.
+     </p>
+   </section>
+</section>       
+  
+<!-- ======================================================== -->  
+ <section>
+   <title>SINH</title>
+   <p>Returns the hyperbolic sine of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>SINH(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the SINH function to return the hyperbolic sine of an expression. 
+     </p>
+   </section>
+</section>
+
+
+<!-- ======================================================== -->  
+ <section>
+   <title>SQRT</title>
+   <p>Returns the positive square root of an expression.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>SQRT(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the SQRT function to return the positive square root of an expression. 
+     </p>
+   </section>
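+
+<section>
+   <title>Example</title>
+   <p>In this sketch SQRT is used to compute the distance of a point from the origin. The file name 'points' and the field names are illustrative assumptions. Per the notes at the start of this section, a result that is not a number (for example, the square root of a negative value) is returned as null.</p>
+<source>
+A = LOAD 'points' AS (x: double, y: double);
+
+-- Euclidean distance from (0, 0)
+B = FOREACH A GENERATE x, y, SQRT(x * x + y * y) AS dist;
+</source>
+</section>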
+</section>
+
+<!-- ======================================================== -->  
+ <section>
+   <title>TAN</title>
+   <p>Returns the trigonometric tangent of an angle.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TAN(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression (angle) whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the TAN function to return the trigonometric tangent of an angle.
+     </p>
+   </section>
+
+</section>
+
+<!-- ======================================================== -->  
+ <section>
+   <title>TANH</title>
+   <p>Returns the hyperbolic tangent of an expression. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TANH(expression)</p>
+            </td>
+         </tr> 
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result is type double.</p>
+            </td>
+         </tr>
+        
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the TANH function to return the hyperbolic tangent of an expression. 
+     </p>
+   </section>
+</section>
+</section>
+<!-- End Math Functions --> 
+
+
+<!-- ======================================================== -->
+<!-- ======================================================== -->   
+
+<!-- String Functions -->
+<section>
+<title>String Functions</title>
+
+<p>For general information about these functions, see the <a href="http://download.oracle.com/javase/6/docs/api/">Java API Specification</a>, 
+<a href="http://download.oracle.com/javase/6/docs/api/java/lang/String.html">Class String</a>. Note the following:</p>
+
+<ul>
+	<li>
+<p>Pig function names are case sensitive and UPPER CASE.</p>
+	</li>
+		<li>
+<p>Pig string functions have an extra, first parameter: the string to which all the operations are applied.</p>
+	</li>
+		<li>
+<p>Pig may process results differently than as stated in the Java API Specification. If any of the input parameters are null or if an insufficient number of parameters are supplied, NULL is returned.</p>
+	</li>
+
+</ul>
+ 
+ <section>
+   <title>INDEXOF</title>
+   <p>Returns the index of the first occurrence of a character in a string, searching forward from a start index. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>INDEXOF(string, 'character', startIndex)</p>
+            </td>
+         </tr>
+   </table>
+ </section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>string</p>
+            </td>
+            <td>
+               <p>The string to be searched.</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>'character'</p>
+            </td>
+            <td>
+               <p>The character being searched for, in quotes. </p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>startIndex</p>
+            </td>
+            <td>
+               <p>The index from which to begin the forward search. </p>
+               <p>The string index begins with zero (0).</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the INDEXOF function to determine the index of the first occurrence of a character in a string. The forward search for the character begins at the designated start index.
+     </p>
+
+</section>
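+
+<section>
+   <title>Example</title>
+   <p>In this sketch INDEXOF searches each name for the character 'a', starting at the beginning of the string (index 0). The file name 'student' and the schema are illustrative assumptions.</p>
+<source>
+A = LOAD 'student' AS (name: chararray, age: int, gpa: float);
+
+-- forward search for 'a', beginning at index 0
+B = FOREACH A GENERATE name, INDEXOF(name, 'a', 0);
+</source>
+</section>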
+</section> 
+
+<!-- ======================================================== -->  
+ <section>
+   <title>LAST_INDEX_OF</title>
+   <p>Returns the index of the last occurrence of a character in a string, searching backward from a start index. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>LAST_INDEX_OF(expression)</p>
+            </td>
+         </tr>
+   </table>
+   </section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>string</p>
+            </td>
+            <td>
+               <p>The string to be searched.</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>'character'</p>
+            </td>
+            <td>
+               <p>The character being searched for, in quotes.</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>startIndex</p>
+            </td>
+            <td>
+               <p>The index from which to begin the backward search.</p>
+               <p>The string index begins with zero (0).</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the LAST_INDEX_OF function to determine the index of the last occurrence of a character in a string. The backward search for the character begins at the designated start index.
+     </p>
+</section>
+</section> 
+
+
+<!-- ======================================================== -->  
+ <section>
+   <title>LCFIRST</title>
+   <p>Converts the first character in a string to lower case. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>LCFIRST(expression)</p>
+            </td>
+         </tr>
+   </table>
+ </section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result type is chararray.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the LCFIRST function to convert only the first character in a string to lower case. 
+     </p>
+</section>
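+
+<section>
+   <title>Example</title>
+   <p>In this sketch LCFIRST converts only the first character of each name to lower case and leaves the remaining characters unchanged. The file name 'student' and the schema are illustrative assumptions.</p>
+<source>
+A = LOAD 'student' AS (name: chararray, age: int, gpa: float);
+
+B = FOREACH A GENERATE name, LCFIRST(name);
+</source>
+</section>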
+</section> 
+
+<!-- ======================================================== -->  
+ <section>
+   <title>LOWER</title>
+   <p>Converts all characters in a string to lower case. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>LOWER(expression)</p>
+            </td>
+         </tr>
+   </table>
+</section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result type is chararray.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the LOWER function to convert all characters in a string to lower case. 
+     </p>
+</section>
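+
+<section>
+     <title>Example</title>
+     <p>
+A minimal sketch of one common use, a case-insensitive comparison. The file 'users' and its fields are hypothetical and only serve the illustration.
+     </p>
+ <source>
+-- 'users', name and email are illustrative names
+A = LOAD 'users' AS (name:chararray, email:chararray);
+-- keep rows whose name is alice in any mix of upper and lower case
+B = FILTER A BY LOWER(name) == 'alice';
+</source>
+</section>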
+</section> 
+
+
+<!-- ======================================================== -->
+ <section>
+   <title>REGEX_EXTRACT </title>
+   <p>Performs regular expression matching and extracts the matched group defined by an index parameter. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>REGEX_EXTRACT (string, regex, index)</p>
+            </td>
+         </tr>
+   </table>
+ </section>
+
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>string</p>
+            </td>
+            <td>
+               <p>The string in which to perform the match.</p>
+            </td>
+         </tr> 
+        <tr>
+            <td>
+               <p>regex</p>
+            </td>
+            <td>
+               <p>The regular expression.</p>
+            </td>
+         </tr> 
+         
+                <tr>
+            <td>
+               <p>index</p>
+            </td>
+            <td>
+               <p>The index of the matched group to return.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the REGEX_EXTRACT function to perform regular expression matching and to extract the matched group defined by the index parameter (the index is 1-based). The function uses Java regular expression syntax.
+     </p>
+     <p>
+The function returns a string that corresponds to the matched group in the position specified by the index. If there is no matched expression at that position, NULL is returned.
+     </p>
+ </section>
+ 
+ <section>
+     <title>Example</title>
+     <p>
+This example will return the string '192.168.1.5'.
+     </p>
+ <source>
+REGEX_EXTRACT('192.168.1.5:8020', '(.*)\:(.*)', 1);
+</source>
+     
+ </section>
+
+</section>
+
+<!-- ======================================================== -->
+ <section>
+   <title>REGEX_EXTRACT_ALL </title>
+   <p>Performs regular expression matching and extracts all matched groups.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>REGEX_EXTRACT_ALL (string, regex)</p>
+            </td>
+         </tr>
+   </table>
+ </section>
+
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>string</p>
+            </td>
+            <td>
+               <p>The string in which to perform the match.</p>
+            </td>
+         </tr> 
+         
+                <tr>
+            <td>
+               <p>regex</p>
+            </td>
+            <td>
+               <p>The regular expression.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the REGEX_EXTRACT_ALL function to perform regular expression matching and to extract all matched groups. The function uses Java regular expression form.
+     </p>
+     <p>
+The function returns a tuple where each field represents a matched expression. If there is no match, an empty tuple is returned.
+     </p>
+ </section>
+ 
+ <section>
+     <title>Example</title>
+     <p>
+This example will return the tuple (192.168.1.5,8020).
+     </p>
+ <source>
+REGEX_EXTRACT_ALL('192.168.1.5:8020', '(.*)\:(.*)');
+</source>
+     
+ </section>
+
+</section>
+
+
+<!-- ======================================================== -->  
+ <section>
+   <title>REPLACE</title>
+   <p>Replaces existing characters in a string with new characters.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>REPLACE(string, 'oldChar', 'newChar');</p>
+            </td>
+         </tr>  
+   </table>
+ </section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>string</p>
+            </td>
+            <td>
+               <p>The string to be updated.</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>'oldChar'</p>
+            </td>
+            <td>
+               <p>The existing characters being replaced, in quotes. </p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>'newChar'</p>
+            </td>
+            <td>
+               <p>The new characters replacing the existing characters, in quotes.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the REPLACE function to replace existing characters in a string with new characters.
+     </p>
+     <p>
+For example, to change "open source software" to "open source wiki" use this statement: 
+REPLACE(string,'software','wiki');
+     </p>
+</section>
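+
+<section>
+     <title>Example</title>
+     <p>
+A minimal sketch that places the statement above in a script, assuming a hypothetical file 'projects' with a single chararray field named description. A value of open source software would be expected to become open source wiki. Depending on the Pig version, the second argument may be interpreted as a Java regular expression, in which case characters such as . or * would need escaping; treat this as an assumption to verify, not as something this document states.
+     </p>
+ <source>
+-- 'projects' and the field description are illustrative names
+A = LOAD 'projects' AS (description:chararray);
+B = FOREACH A GENERATE REPLACE(description, 'software', 'wiki');
+</source>
+</section>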
+</section> 
+
+<!-- ======================================================== -->  
+ <section>
+   <title>STRSPLIT</title>
+   <p>Splits a string around matches of a given regular expression. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>STRSPLIT(string, regex, limit)</p>
+            </td>
+         </tr> 
+        
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>string</p>
+            </td>
+            <td>
+               <p>The string to be split.</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>regex</p>
+            </td>
+            <td>
+               <p>The regular expression.</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>limit</p>
+            </td>
+            <td>
+               <p>When positive, the maximum number of substrings to return: the pattern (the compiled representation of the regular expression) is applied at most limit - 1 times.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the STRSPLIT function to split a string around matches of a given regular expression.
+     </p>
+     <p>
+For example, given the string (open:source:software), STRSPLIT (string, ':',2) will return ((open,source:software)) and STRSPLIT (string, ':',3) will return ((open,source,software)).
+     </p>
+</section>
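+
+<section>
+     <title>Example</title>
+     <p>
+A minimal sketch of the calls described above, assuming a hypothetical file 'phrases' with a single chararray field named phrase. Because STRSPLIT returns a tuple, FLATTEN can be applied when the pieces are wanted as separate fields.
+     </p>
+ <source>
+-- 'phrases' and the field phrase are illustrative names
+A = LOAD 'phrases' AS (phrase:chararray);
+-- each output row holds one tuple, such as (open,source:software)
+B = FOREACH A GENERATE STRSPLIT(phrase, ':', 2);
+-- FLATTEN promotes the elements of that tuple to top-level fields
+C = FOREACH A GENERATE FLATTEN(STRSPLIT(phrase, ':', 2));
+</source>
+</section>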
+</section> 
+
+<!-- ======================================================== -->  
+ <section>
+   <title>SUBSTRING</title>
+   <p>Returns a substring from a given string. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>SUBSTRING(string, startIndex, stopIndex)</p>
+            </td>
+         </tr> 
+        
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+          <tr>
+            <td>
+               <p>string</p>
+            </td>
+            <td>
+               <p>The string from which a substring will be extracted.</p>
+            </td>
+         </tr> 
+       <tr>
+            <td>
+               <p>startIndex</p>
+            </td>
+            <td>
+               <p>The index (type integer) of the first character of the substring.</p>
+               <p>The index of a string begins with zero (0).</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>stopIndex</p>
+            </td>
+            <td>
+               <p>The index (type integer) of the character <em>following</em> the last character of the substring.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the SUBSTRING function to return a substring from a given string. 
+     </p>
+          <p>  
+Given a field named alpha whose value is ABCDEF, to return substring BCD use this statement: SUBSTRING(alpha,1,4). Note that 1 is the index of B (the first character of the substring) and  4 is the index of E  (the character <em>following</em> the last character of the substring).
+     </p>
+</section>
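+
+<section>
+     <title>Example</title>
+     <p>
+A minimal sketch of the statement above in a script. The file name 'letters' is hypothetical; alpha is the field from the description above.
+     </p>
+ <source>
+-- 'letters' is an illustrative file name
+A = LOAD 'letters' AS (alpha:chararray);
+-- for alpha = ABCDEF this selects the characters at indexes 1, 2 and 3 (BCD)
+B = FOREACH A GENERATE SUBSTRING(alpha, 1, 4);
+</source>
+</section>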
+</section> 
+
+<!-- ======================================================== -->  
+ <section>
+   <title>TRIM</title>
+   <p>Returns a copy of a string with leading and trailing white space removed.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TRIM(expression)</p>
+            </td>
+         </tr> 
+        
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result type is chararray.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the TRIM function to remove leading and trailing white space from a string.
+     </p>
+</section>
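+
+<section>
+     <title>Example</title>
+     <p>
+A minimal sketch, assuming a hypothetical file 'users' whose name field may carry stray spaces around the value. Trimming before a comparison or a GROUP keeps otherwise identical values from being treated as different.
+     </p>
+ <source>
+-- 'users' and the field name are illustrative names
+A = LOAD 'users' AS (name:chararray);
+B = FOREACH A GENERATE TRIM(name) AS name;
+</source>
+</section>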
+</section> 
+
+<!-- ======================================================== -->  
+ <section>
+   <title>UCFIRST</title>
+   <p>Returns a string with the first character converted to upper case. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>UCFIRST(expression)</p>
+            </td>
+         </tr> 
+        
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result type is chararray.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the UCFIRST function to convert only the first character in a string to upper case. 
+     </p>
+</section>
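+
+<section>
+     <title>Example</title>
+     <p>
+A minimal sketch that combines UCFIRST with LOWER to normalize capitalization, assuming a hypothetical file 'names' with a single chararray field called name. A value such as aPACHE would be expected to become Apache.
+     </p>
+ <source>
+-- 'names' and the field name are illustrative names
+A = LOAD 'names' AS (name:chararray);
+-- lower-case the whole string first, then capitalize its first character
+B = FOREACH A GENERATE UCFIRST(LOWER(name));
+</source>
+</section>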
+</section>
+
+<!-- ======================================================== -->  
+ <section>
+   <title>UPPER</title>
+   <p>Returns a string converted to upper case. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>UPPER(expression)</p>
+            </td>
+         </tr> 
+        
+   </table></section>
+   
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression whose result type is chararray. </p>
+            </td>
+         </tr> 
+   </table>
+</section>
+   
+<section>
+     <title>Usage</title>
+     <p>
+Use the UPPER function to convert all characters in a string to upper case.
+     </p>
+   </section>
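+
+<section>
+     <title>Example</title>
+     <p>
+A minimal sketch that normalizes a code field to upper case before grouping, so that values such as us and US fall into the same group. The file 'visits' and its fields are hypothetical.
+     </p>
+ <source>
+-- 'visits', country and hits are illustrative names
+A = LOAD 'visits' AS (country:chararray, hits:int);
+B = FOREACH A GENERATE UPPER(country) AS country, hits;
+C = GROUP B BY country;
+</source>
+</section>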
+</section>
+ 
+</section>
+<!-- End String Functions -->
+
+
+<!-- ======================================================== -->
+<!-- ======================================================== -->
+<!-- Bag and Tuple Functions -->
+<section>
+<title>Bag and Tuple Functions</title>
+
+
+<!-- ======================================================== -->
+ <section>
+   <title>TOBAG</title>
+   <p>Converts one or more expressions to type bag. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TOBAG(expression [, expression ...])</p>
+            </td>
+         </tr> 
+        
+   </table>
+ </section>
+
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression with any data type.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the TOBAG function to convert one or more expressions to individual tuples which are then placed in a bag.
+     </p>
+ </section>
+ 
+ <section>
+     <title>Example</title>
+     <p>
+In this example, fields f1 and f3 are converted to tuples that are then placed in a bag.
+     </p>
+ <source>
+a = LOAD 'student' AS (f1:chararray, f2:int, f3:float);
+DUMP a;
+
+(John,18,4.0)
+(Mary,19,3.8)
+(Bill,20,3.9)
+(Joe,18,3.8)
+
+b = FOREACH a GENERATE TOBAG(f1,f3);
+DUMP b;
+
+({(John),(4.0)})
+({(Mary),(3.8)})
+({(Bill),(3.9)})
+({(Joe),(3.8)})
+</source>
+     
+ </section>
+
+</section>
+
+ <!-- ======================================================== -->  
+ <section>
+   <title>TOP</title>
+   <p>Returns the top-n tuples from a bag of tuples.</p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TOP(topN,column,relation)</p>
+            </td>
+         </tr> 
+        
+   </table>
+ </section>
+
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>topN</p>
+            </td>
+            <td>
+               <p>The number of top tuples to return (type integer).</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>column</p>
+            </td>
+            <td>
+               <p>The tuple column whose values are being compared.</p>
+            </td>
+         </tr> 
+                <tr>
+            <td>
+               <p>relation</p>
+            </td>
+            <td>
+               <p>The relation (bag of tuples) containing the tuple column.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+The TOP function returns a bag containing the top N tuples from the input bag, where N is set by the first parameter. Tuples are compared on a single column, whose position is given by the second parameter. The function assumes that all tuples in the bag contain an element of the same type in the compared column.
+     </p>
+</section>
+ 
+ <section>
+     <title>Example</title>
+     <p>
+In this example the top 10 occurrences are returned.
+     </p>
+ <source>
+A = LOAD 'data' as (first: chararray, second: chararray);
+B = GROUP A BY (first, second);
+C = FOREACH B generate FLATTEN(group), COUNT(A) as count;
+D = GROUP C BY first; -- group again, this time by first only
+topResults = FOREACH D {
+    result = TOP(10, 2, C); -- keep the 10 tuples with the highest count (column 2, zero-based) for each value of first
+    GENERATE FLATTEN(result);
+}
+</source>
+     
+ </section>
+
+</section>
+<!-- ======================================================== -->  
+ <section>
+   <title>TOTUPLE</title>
+   <p>Converts one or more expressions to type tuple. </p>
+
+<section>
+   <title>Syntax</title>
+   <table>
+       <tr>
+            <td>
+               <p>TOTUPLE(expression [, expression ...])</p>
+            </td>
+         </tr> 
+        
+   </table>
+ </section>
+
+<section>
+   <title>Terms</title>
+   <table>
+       <tr>
+            <td>
+               <p>expression</p>
+            </td>
+            <td>
+               <p>An expression of any data type.</p>
+            </td>
+         </tr> 
+   </table>
+</section>
+
+<section>
+     <title>Usage</title>
+     <p>
+Use the TOTUPLE function to convert one or more expressions to a tuple.
+     </p>
+ </section>
+ 
+ <section>
+     <title>Example</title>
+     <p>
+In this example, fields f1, f2 and f3 are converted to a tuple.
+     </p>
+ <source>
+a = LOAD 'student' AS (f1:chararray, f2:int, f3:float);
+DUMP a;
+
+(John,18,4.0)
+(Mary,19,3.8)
+(Bill,20,3.9)
+(Joe,18,3.8)
+
+b = FOREACH a GENERATE TOTUPLE(f1,f2,f3);
+DUMP b;
+
+((John,18,4.0))
+((Mary,19,3.8))
+((Bill,20,3.9))
+((Joe,18,3.8))
+</source>
+ </section>
+</section>
+
+</section>
+<!-- End Bag and Tuple Functions -->
+
+  </body>
+</document>

Modified: pig/trunk/src/docs/src/documentation/content/xdocs/index.xml
URL: http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/index.xml?rev=1050082&r1=1050081&r2=1050082&view=diff
==============================================================================
--- pig/trunk/src/docs/src/documentation/content/xdocs/index.xml (original)
+++ pig/trunk/src/docs/src/documentation/content/xdocs/index.xml Thu Dec 16 18:10:59 2010
@@ -21,22 +21,17 @@
   <header>
     <title>Overview </title>
   </header>
-  
   <body>
-      <p>
-        The Pig Documentation provides the information you need to get started using Pig.
-      </p>
-      <p>
-        Begin with the <a href="setup.html"> Pig Setup</a> which shows you how to download and run Pig. 
-        Then try out the <a href="tutorial.html">Pig Tutorial</a> to get an idea of how easy it is to use Pig. 
-      </p>
-      <p>  
-        When you are ready to start writing your own scripts, read through the Pig Latin Reference <a href="piglatin_ref1.html">Manual 1</a>
-        and <a href="piglatin_ref2.html">Manual 2</a> to become familiar with Pig's features. 
-        Also review the <a href="cookbook.html">Pig Cookbook</a> to learn how to tweak your code for optimal performance.
-      </p>
-      <p>
-		If you have more questions, you can ask on the <a href="http://hadoop.apache.org/pig/mailing_lists.html">Pig Mailing Lists</a>.
-    </p>
+      <p>The Pig Documentation provides the information you need to get started using Pig.</p>  
+      
+      <p>Begin with the <a href="start.html">Getting Started</a> guide which shows you how to set up Pig and how to form simple Pig Latin statements.
+      When you are ready to start writing your own scripts, review the <a href="basic.html">Pig Latin Basics</a> manual to 
+        become familiar with the Pig Latin operators and the supported data types.</p>
+        
+      <p>Functions can be a part of almost every operator in Pig. The <a href="func.html">Built In Functions</a> guide describes Pig's built in functions.  
+       The <a href="udf.html">User Defined Functions</a> manual shows you how to write your own functions and how to access functions 
+       contributed by other Pig users. </p>
+      
+      <p>If you have more questions, you can ask on the <a href="http://hadoop.apache.org/pig/mailing_lists.html">Pig Mailing Lists</a>.</p>
   </body>
 </document>