Posted to commits@tajo.apache.org by hy...@apache.org on 2013/11/22 15:57:59 UTC

svn commit: r1544563 [3/4] - /incubator/tajo/site/

Added: incubator/tajo/site/tajo-0.2.0-doc.html
URL: http://svn.apache.org/viewvc/incubator/tajo/site/tajo-0.2.0-doc.html?rev=1544563&view=auto
==============================================================================
--- incubator/tajo/site/tajo-0.2.0-doc.html (added)
+++ incubator/tajo/site/tajo-0.2.0-doc.html Fri Nov 22 14:57:58 2013
@@ -0,0 +1,2034 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia at 2013-11-22
+ | Rendered using Apache Maven Fluido Skin 1.3.0
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20131122" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Tajo - A Big Data Warehouse System on Hadoop - </title>
+    <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
+    <link rel="stylesheet" href="./css/site.css" />
+    <link rel="stylesheet" href="./css/print.css" media="print" />
+
+      
+    
+    
+  
+    <script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
+
+                          
+        
+<meta content="Apache Tajo,big data warehouse system on Hadoop,relational and distributed query engine" name="description"/>
+                      
+        
+<script type="text/javascript">var _gaq = _gaq || [];
+        _gaq.push(['_setAccount', 'UA-38152529-1']);
+        _gaq.push(['_trackPageview']);
+
+        (function() {
+        var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+        ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+        var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+        })();</script>
+                      
+        
+<style>a.externalLink[href^=http] {
+          background-image: none;
+          padding-right: 0;
+        }</style>
+          
+            </head>
+        <body class="topBarEnabled">
+          
+                
+                    
+                
+
+    <div id="topbar" class="navbar navbar-fixed-top ">
+      <div class="navbar-inner">
+                <div class="container-fluid">
+        <a data-target=".nav-collapse" data-toggle="collapse" class="btn btn-navbar">
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+        </a>
+                
+                                <ul class="nav">
+                          <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Tajo <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a href="index.html"  title="Overview">Overview</a>
+</li>
+                  
+                      <li>      <a href="http://www.apache.org/licenses/"  title="License">License</a>
+</li>
+                  
+                      <li>      <a href="downloads.html"  title="Downloads">Downloads</a>
+</li>
+                  
+                      <li>      <a href="tajo-0.2.0-doc.html#Tutorial"  title="Getting Started">Getting Started</a>
+</li>
+                          </ul>
+      </li>
+                <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Community <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a href="http://wiki.apache.org/tajo"  title="Wiki">Wiki</a>
+</li>
+                  
+                      <li>      <a href="team-list.html"  title="Team">Team</a>
+</li>
+                  
+                      <li>      <a href="mail-lists.html"  title="Mailing Lists">Mailing Lists</a>
+</li>
+                  
+                      <li>      <a href="https://issues.apache.org/jira/browse/TAJO"  title="Issue Tracker">Issue Tracker</a>
+</li>
+                  
+                      <li>      <a href="http://wiki.apache.org/tajo/PoweredBy"  title="Powered By">Powered By</a>
+</li>
+                  
+                      <li>      <a href="http://wiki.apache.org/tajo/Presentations"  title="Presentations">Presentations</a>
+</li>
+                          </ul>
+      </li>
+                <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a href="tajo-0.8.0-doc.html"  title="0.8.0-SNAPSHOT (Dev)">0.8.0-SNAPSHOT (Dev)</a>
+</li>
+                  
+                      <li>      <a href="tajo-0.2.0-doc.html"  title="0.2.0-incubating (Current)">0.2.0-incubating (Current)</a>
+</li>
+                          </ul>
+      </li>
+                <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Contributes <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+        
+                      <li>      <a href="source-code.html"  title="Source Code">Source Code</a>
+</li>
+                  
+                      <li>      <a href="http://www.apache.org/foundation/getinvolved.html"  title="Get Involved">Get Involved</a>
+</li>
+                  
+                      <li>      <a href="http://wiki.apache.org/tajo/HowToContribute"  title="How To Contribute">How To Contribute</a>
+</li>
+                  
+                      <li>      <a href="https://reviews.apache.org/groups/tajo"  title="Review Board">Review Board</a>
+</li>
+                  
+                      <li>      <a href="http://wiki.apache.org/tajo/CodingStyle"  title="Coding Style">Coding Style</a>
+</li>
+                          </ul>
+      </li>
+                  </ul>
+          
+          
+          
+                               <ul class="nav pull-right">
+              <li class="dropdown">
+                <a href="#" class="dropdown-toggle" data-toggle="dropdown">External Links <b class="caret"></b></a>
+                <ul class="dropdown-menu">
+                      <li>      <a href="https://git-wip-us.apache.org/repos/asf/incubator-tajo.git"  title="GIT">GIT</a>
+</li>
+                  </ul>
+              </li>
+            </ul>
+          
+                      </div>
+          
+        </div>
+      </div>
+    </div>
+    
+        <div class="container-fluid">
+          <div id="banner">
+        <div class="pull-left">
+                                    <a href="http://tajo.incubator.apache.org" id="bannerLeft">
+                                                                                                <img src="./images/tajo_logo.png"  alt="Apache Tajo" width="240"/>
+                </a>
+                      </div>
+        <div class="pull-right">                  <a href="http://incubator.apache.org/" id="bannerRight">
+                                                                                        <img src="http://incubator.apache.org/images/egg-logo.png"  alt="Apache Incubator"/>
+                </a>
+      </div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+                
+                    
+                              <li class="">
+                    <a href="http://www.apache.org" class="externalLink" title="Apache">
+        Apache</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="../" title="Incubator">
+        Incubator</a>
+        </li>
+      <li class="divider ">/</li>
+                <li class="">
+                    <a href="./" title="Tajo">
+        Tajo</a>
+        </li>
+      <li class="divider ">/</li>
+        <li class=""></li>
+        
+                
+                    
+                  <li id="publishDate" class="pull-right">Last Published: 2013-11-22</li> <li class="divider pull-right">|</li>
+              <li id="projectVersion" class="pull-right">Version: 0.8.0-SNAPSHOT</li>
+            
+                            </ul>
+      </div>
+
+            
+      <div class="row-fluid">
+        <div id="leftColumn" class="span3">
+          <div class="well sidebar-nav">
+                
+                    
+                <ul class="nav nav-list">
+                    <li class="nav-header">Tajo</li>
+                                
+      <li>
+    
+                          <a href="index.html" title="Overview">
+          <i class="none"></i>
+        Overview</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="http://www.apache.org/licenses/" class="externalLink" title="License">
+          <i class="none"></i>
+        License</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="downloads.html" title="Downloads">
+          <i class="none"></i>
+        Downloads</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="tajo-0.2.0-doc.html#Tutorial" title="Getting Started">
+          <i class="none"></i>
+        Getting Started</a>
+            </li>
+                              <li class="nav-header">Community</li>
+                                
+      <li>
+    
+                          <a href="http://wiki.apache.org/tajo" class="externalLink" title="Wiki">
+          <i class="none"></i>
+        Wiki</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="team-list.html" title="Team">
+          <i class="none"></i>
+        Team</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="mail-lists.html" title="Mailing Lists">
+          <i class="none"></i>
+        Mailing Lists</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="https://issues.apache.org/jira/browse/TAJO" class="externalLink" title="Issue Tracker">
+          <i class="none"></i>
+        Issue Tracker</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="http://wiki.apache.org/tajo/PoweredBy" class="externalLink" title="Powered By">
+          <i class="none"></i>
+        Powered By</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="http://wiki.apache.org/tajo/Presentations" class="externalLink" title="Presentations">
+          <i class="none"></i>
+        Presentations</a>
+            </li>
+                              <li class="nav-header">Documentation</li>
+                                
+      <li>
+    
+                          <a href="tajo-0.8.0-doc.html" title="0.8.0-SNAPSHOT (Dev)">
+          <i class="none"></i>
+        0.8.0-SNAPSHOT (Dev)</a>
+            </li>
+                  
+      <li class="active">
+    
+            <a href="#"><i class="none"></i>0.2.0-incubating (Current)</a>
+          </li>
+                              <li class="nav-header">Contributes</li>
+                                
+      <li>
+    
+                          <a href="source-code.html" title="Source Code">
+          <i class="none"></i>
+        Source Code</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="http://www.apache.org/foundation/getinvolved.html" class="externalLink" title="Get Involved">
+          <i class="none"></i>
+        Get Involved</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="http://wiki.apache.org/tajo/HowToContribute" class="externalLink" title="How To Contribute">
+          <i class="none"></i>
+        How To Contribute</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="https://reviews.apache.org/groups/tajo" class="externalLink" title="Review Board">
+          <i class="none"></i>
+        Review Board</a>
+            </li>
+                  
+      <li>
+    
+                          <a href="http://wiki.apache.org/tajo/CodingStyle" class="externalLink" title="Coding Style">
+          <i class="none"></i>
+        Coding Style</a>
+            </li>
+            </ul>
+                
+                    
+                
+          <hr class="divider" />
+
+           <div id="poweredBy">
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+               
+        
+        
+        <div id="twitter">
+    
+    <a href="https://twitter.com/ApacheTajo" class="twitter-follow-button" data-show-count="false" data-align="left" data-size="medium" data-show-screen-name="true" data-lang="en">Follow ApacheTajo</a>
+    <script type="text/javascript">!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs");</script>
+
+        </div>
+                   <div class="clear"></div>
+                             <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
+        <img class="builtBy" alt="Built by Maven" src="./images/logos/maven-feather.png" />
+      </a>
+                  </div>
+          </div>
+        </div>
+        
+                
+        <div id="bodyColumn"  class="span9" >
+                                  
+            <!-- Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. --><h1>Apache Tajo 0.2.0-incubating Release Documentation</h1>
+
+<ul>
+  
+<li>Last Updated Date: 2013.11.21</li>
+</ul>
+<div class="section">
+<h2>Table of Contents<a name="Table_of_Contents"></a></h2>
+
+<ul>
+  
+<li><a href="#WhatIsApacheTajo">What is Apache Tajo?</a></li>
+  
+<li><a href="#GettingStarted">Tutorial - Getting Started</a>
+  
+<ul>
+    
+<li><a href="#Prerequisite">Prerequisite</a></li>
+    
+<li><a href="#Download">Download</a>
+    
+<ul>
+      
+<li><a href="#BinaryDownload">Binary Download</a></li>
+      
+<li><a href="#SourceDownload">Source Download</a></li>
+    </ul></li>
+    
+<li><a href="#Installation">Installation</a>
+    
+<ul>
+      
+<li><a href="#UnpackTarball">Unpack tarball</a></li>
+      
+<li><a href="#SetupATajoCluster">Setup a Tajo Cluster</a></li>
+      
+<li><a href="#LaunchATajoCluster">Launch a Tajo Cluster</a></li>
+    </ul></li>
+    
+<li><a href="#FirstQueryExecution">First Query Execution</a></li>
+    
+<li><a href="#DistributedMode">Distributed mode on HDFS cluster</a></li>
+  </ul></li>
+  
+<li><a href="#Configuration">Configuration</a>
+  
+<ul>
+    
+<li><a href="#Preliminary">Preliminary</a>
+    
+<ul>
+      
+<li><a href="#catalog-site_and_tajo-site">catalog-site.xml and tajo-site.xml</a></li>
+    </ul></li>
+    
+<li><a href="#TajoMasterConfiguration">TajoMaster Configuration</a>
+    
+<ul>
+      
+<li><a href="#TajoRootDir">Tajo Rootdir Setting</a></li>
+      
+<li><a href="#TajoMasterHeap">TajoMaster Heap Memory Size</a></li>
+    </ul></li>
+    
+<li><a href="#TajoWorkerConfiguration">Tajo Worker Configuration</a>
+    
+<ul>
+      
+<li><a href="#WorkerHeap">Worker Heap Memory Size</a></li>
+      
+<li><a href="#TemporaryDataDir">Temporary Data Directory</a></li>
+      
+<li><a href="#MaximumParallelRunningTasks">Maximum number of parallel running tasks for each worker</a></li>
+    </ul></li>
+    
+<li><a href="#CatalogConfiguration">Catalog Configuration</a></li>
+    
+<li><a href="#DefaultPorts">RPC/Http Service Configuration and Default Addresses</a>
+    
+<ul>
+      
+<li><a href="#TajoMasterDefaultPorts">Tajo Master</a></li>
+      
+<li><a href="#TajoWorkerDefaultPorts">Worker</a></li>
+    </ul></li>
+  </ul></li>
+  
+<li><a href="#CommandLineInterface">Command Line Interface (tsql)</a>
+  
+<ul>
+    
+<li><a href="#EnteringTsql">Entering tsql shell</a></li>
+    
+<li><a href="#MetaCommands">Meta Commands</a></li>
+    
+<li><a href="#CLI_Examples">Examples</a></li>
+  </ul></li>
+  
+<li><a href="#DataModel">Data Model</a>
+  
+<ul>
+    
+<li><a href="#DataTypes">Data Types</a>
+    
+<ul>
+      
+<li><a href="#UsingRealNumberValue">Using real number value</a></li>
+    </ul></li>
+  </ul></li>
+  
+<li><a href="#SQLLanguage">SQL Language</a>
+  
+<ul>
+    
+<li><a href="#DDL">Data Definition Language (DDL)</a>
+    
+<ul>
+      
+<li><a href="#CreateTable">CREATE TABLE</a></li>
+      
+<li><a href="#DDLCompression">Compression</a></li>
+    </ul></li>
+    
+<li><a href="#DML">Data Manipulation Language (DML)</a>
+    
+<ul>
+      
+<li><a href="#SQLExpressions">SQL Expressions</a>
+      
+<ul>
+        
+<li><a href="#ArithmeticExpressions">Arithmetic Expressions</a></li>
+        
+<li><a href="#TypeCasts">Type Casts</a></li>
+        
+<li><a href="#StringExpressions">String Expressions</a></li>
+        
+<li><a href="#FunctionCall">Function Call</a></li>
+      </ul></li>
+      
+<li><a href="#Select">SELECT</a></li>
+      
+<li><a href="#Where">WHERE</a>
+      
+<ul>
+        
+<li><a href="#InPredicate">IN Predicate</a></li>
+        
+<li><a href="#StringPatternMatching">String Pattern Matching Predicates</a> (LIKE, ILIKE, SIMILAR TO, REGULAR EXPRESSIONS)</li>
+      </ul></li>
+      
+<li><a href="#InsertOverwrite">INSERT (OVERWRITE) INTO</a></li>
+    </ul></li>
+    
+<li><a href="#Functions">Functions</a>
+    
+<ul>
+      
+<li><a href="#StandardFunctions">Standard Functions</a></li>
+      
+<li><a href="#StringFunctions">String Functions</a></li>
+    </ul></li>
+  </ul></li>
+  
+<li><a href="#Administration">Administration</a>
+  
+<ul>
+    
+<li><a href="#CatalogBackup">Catalog Backup</a>
+    
+<ul>
+      
+<li><a href="#SQLDump">SQL dump</a></li>
+      
+<li><a href="#DatabaseLevelBackup">Database-level Backup</a></li>
+    </ul></li>
+  </ul></li>
+</ul>
+<h1><a name="WhatIsApacheTajo"></a> What is Apache Tajo?</h1>
+<p>Tajo is <i><b>a big data warehouse system on Hadoop</b></i> that provides low-latency and scalable ad-hoc queries and ETL on large data sets stored on HDFS and other data sources.</p>
+<h1><a name="GettingStarted"></a>Tutorial - Getting Started</h1></div>
+<div class="section">
+<h2><a name="Prerequisite"></a>Prerequisite</h2>
+
+<ul>
+  
+<li>Hadoop 2.0.3-alpha or 2.0.5-alpha</li>
+  
+<li>Java 1.6 or higher</li>
+  
+<li>Protocol buffer 2.4.1</li>
+</ul></div>
+<div class="section">
+<h2><a name="Download"></a>Download</h2>
+<div class="section">
+<h3><a name="BinaryDownload"></a>Binary Download<a name="Binary_Download"></a></h3>
+<p>Download the binary release from <a class="externalLink" href="http://tajo.incubator.apache.org/downloads.html">http://tajo.incubator.apache.org/downloads.html</a>.</p></div>
+<div class="section">
+<h3><a name="SourceDownload"></a>Source Download<a name="Source_Download"></a></h3>
+<p>Download the source code and build Tajo as follows:</p>
+
+<div class="source">
+<pre>$ git clone https://git-wip-us.apache.org/repos/asf/incubator-tajo.git tajo
+</pre></div></div></div>
+<div class="section">
+<h2><a name="BuildSourceCode"></a>Build Source Code<a name="Build_Source_Code"></a></h2>
+<p>You can compile source code and get a binary archive as follows:</p>
+
+<div class="source">
+<pre>$ cd tajo
+$ mvn clean package -DskipTests -Pdist -Dtar
+$ ls tajo-dist/target/tajo-x.y.z-SNAPSHOT.tar.gz
+</pre></div></div>
+<div class="section">
+<h2><a name="Installation"></a>Installation</h2>
+<div class="section">
+<h3><a name="UnpackTarball"></a>Unpack tarball<a name="Unpack_tarball"></a></h3>
+<p>Unpack the tarball (see the build instructions above).</p>
+
+<div class="source">
+<pre>$ tar xzvf tajo-0.2.0-SNAPSHOT.tar.gz
+</pre></div>
+<p>This creates a subdirectory named tajo-x.y.z-SNAPSHOT. You MUST copy this directory to the same path on all cluster nodes.</p></div>
+<div class="section">
+<h3><a name="SetupATajoCluster"></a>Setup a Tajo cluster<a name="Setup_a_Tajo_cluster"></a></h3>
+<p>First, set the following environment variables in conf/tajo-env.sh.</p>
+
+<div class="source">
+<pre># Hadoop home. Required
+export HADOOP_HOME= ...
+
+# The java implementation to use.  Required.
+export JAVA_HOME= ...
+</pre></div></div></div>
+<div class="section">
+<h2><a name="LaunchATajoCluster"></a>Launch a Tajo cluster<a name="Launch_a_Tajo_cluster"></a></h2>
+<p>To launch the Tajo master, execute start-tajo.sh.</p>
+
+<div class="source">
+<pre>$ $TAJO_HOME/bin/start-tajo.sh
+</pre></div>
+<p>After that, you can use tsql to access the command line interface of Tajo. If you want to know how to use tsql, read the <a href="#CommandLineInterface">Command Line Interface (tsql)</a> section.</p>
+
+<div class="source">
+<pre>$ $TAJO_HOME/bin/tsql
+</pre></div>
+<p>If you type ? in tsql, you can see the help documentation.</p></div>
+<div class="section">
+<h2><a name="FirstQueryExecution"></a>First Query Execution<a name="First_Query_Execution"></a></h2>
+<p>First of all, we need to prepare some data for query execution. For example, you can make a simple text-based table as follows:</p>
+
+<div class="source">
+<pre>$ mkdir /home/x/table1
+$ cd /home/x/table1
+$ cat &gt; data.csv
+1|abc|1.1|a
+2|def|2.3|b
+3|ghi|3.4|c
+4|jkl|4.5|d
+5|mno|5.6|e
+&lt;CTRL + D&gt;
+</pre></div>
+<p>The schema of this table is (int, text, float, text).</p>
+
+<div class="source">
+<pre>$ $TAJO_HOME/bin/tsql
+
+tajo&gt; create external table table1 (id int, name text, score float, type text) using csv with ('csvfile.delimiter'='|') location 'file:/home/x/table1';
+</pre></div>
+<p>To load an external table, you need to use the &#x2018;create external table&#x2019; statement. In the location clause, you should use the absolute directory path with an appropriate scheme. If the table resides in HDFS, you should use &#x2018;hdfs&#x2019; instead of &#x2018;file&#x2019;.</p>
+<p>For more detail on DDL statements, see <a href="#DDL">Data Definition Language (DDL)</a>.</p>
+
+<div class="source">
+<pre>tajo&gt; \d
+table1
+</pre></div>
+<p>The &#x2018;\d&#x2019; command shows the list of tables.</p>
+
+<div class="source">
+<pre>tajo&gt; \d table1
+
+table name: table1
+table path: file:/home/x/table1
+store type: CSV
+number of rows: 0
+volume (bytes): 78 B
+schema:
+id      INT
+name    TEXT
+score   FLOAT
+type    TEXT
+</pre></div>
+<p>The &#x2018;\d [table name]&#x2019; command shows the description of a given table.</p>
+<p>You can also execute SQL queries as follows:</p>
+
+<div class="source">
+<pre>tajo&gt; select * from table1 where id &gt; 2;
+final state: QUERY_SUCCEEDED, init time: 0.069 sec, response time: 0.397 sec
+result: file:/tmp/tajo-hadoop/staging/q_1363768615503_0001_000001/RESULT, 3 rows ( 35B)
+
+id,  name,  score,  type
+- - - - - - - - - -  - - -
+3,  ghi,  3.4,  c
+4,  jkl,  4.5,  d
+5,  mno,  5.6,  e
+
+tajo&gt;
+</pre></div></div>
+<div class="section">
+<h2><a name="DistributedMode"></a>Distributed mode on HDFS cluster<a name="Distributed_mode_on_HDFS_cluster"></a></h2>
+<p>Add the following configs to the tajo-site.xml file.</p>
+
+<div class="source">
+<pre>  &lt;property&gt;
+    &lt;name&gt;tajo.rootdir&lt;/name&gt;
+    &lt;value&gt;hdfs://hostname:port/tajo&lt;/value&gt;
+  &lt;/property&gt;
+
+  &lt;property&gt;
+    &lt;name&gt;tajo.master.umbilical-rpc.address&lt;/name&gt;
+    &lt;value&gt;hostname:26001&lt;/value&gt;
+  &lt;/property&gt;
+
+  &lt;property&gt;
+    &lt;name&gt;tajo.catalog.client-rpc.address&lt;/name&gt;
+    &lt;value&gt;hostname:26005&lt;/value&gt;
+  &lt;/property&gt;
+</pre></div>
+<p>For more detail on Tajo&#x2019;s configuration, see the <a href="#Configuration">Configuration</a> section.</p>
+<p>Before launching Tajo, create the Tajo root directory and set its permissions as follows:</p>
+
+<div class="source">
+<pre>$ $HADOOP_HOME/bin/hadoop fs -mkdir       /tajo
+$ $HADOOP_HOME/bin/hadoop fs -chmod g+w   /tajo
+</pre></div>
+<p>Then, execute start-tajo.sh:</p>
+
+<div class="source">
+<pre>$ $TAJO_HOME/bin/start-tajo.sh
+</pre></div>
+<p>Enjoy Apache Tajo!</p>
+<h1><a name="Configuration"></a>Configuration</h1></div>
+<div class="section">
+<h2><a name="Preliminary"></a>Preliminary</h2>
+<div class="section">
+<h3><a name="catalog-site_and_tajo-site"></a>catalog-site.xml and tajo-site.xml<a name="catalog-site.xml_and_tajo-site.xml"></a></h3>
+<p>Tajo&#x2019;s configuration is based on Hadoop&#x2019;s configuration system. Tajo uses two config files:</p>
+
+<ul>
+  
+<li>catalog-site.xml - configuration for the catalog server.</li>
+  
+<li>tajo-site.xml - configuration for other tajo modules.</li>
+</ul>
+<p>Each config is a name-value pair. If you want to set the config name a.b.c to the value 123, add the following element to the appropriate file.</p>
+
+<div class="source">
+<pre>  &lt;property&gt;
+    &lt;name&gt;a.b.c&lt;/name&gt;
+    &lt;value&gt;123&lt;/value&gt;
+  &lt;/property&gt;
+</pre></div>
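+<p>In a complete file, these property elements sit inside a single <tt>configuration</tt> root element, as in Hadoop&#x2019;s own site files. A minimal sketch of a whole tajo-site.xml, reusing the placeholder name a.b.c from above (it is not a real Tajo config):</p>
+
+<div class="source">
+<pre>&lt;?xml version="1.0"?&gt;
+&lt;configuration&gt;
+  &lt;!-- a.b.c is the placeholder name used above, not a real Tajo config --&gt;
+  &lt;property&gt;
+    &lt;name&gt;a.b.c&lt;/name&gt;
+    &lt;value&gt;123&lt;/value&gt;
+  &lt;/property&gt;
+&lt;/configuration&gt;
+</pre></div>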
+<p>Tajo has a variety of internal configs. If you don&#x2019;t set a config explicitly, its default value is used. Tajo is designed to need only a few configs in usual cases, so you may not need to change the configuration at all.</p>
+<p>By default, there is no tajo-site.xml in the $TAJO_HOME/conf directory. To set configs, first copy $TAJO_HOME/conf/tajo-site.xml.templete to tajo-site.xml, and then add the configs to your tajo-site.xml.</p></div>
+<div class="section">
+<h3><a name="tajo-env"></a>tajo-env.sh<a name="tajo-env.sh"></a></h3>
+<p>tajo-env.sh is a shell script file. Its main purpose is to set shell environment variables for the TajoMaster and TajoWorker Java programs. You can set a variable as follows:</p>
+
+<div class="source">
+<pre>VARIABLE=value
+</pre></div>
+<p>If the value is a literal string, quote it as follows:</p>
+
+<div class="source">
+<pre>VARIABLE='value'
+</pre></div></div>
+<div class="section">
+<h3><a name="TajoMasterConfiguration"></a>TajoMaster Configuration<a name="TajoMaster_Configuration"></a></h3>
+<div class="section">
+<h4><a name="TajoRootDir"></a>Tajo Rootdir Setting<a name="Tajo_Rootdir_Setting"></a></h4>
+<p>Tajo uses HDFS as its primary storage layer, so each Tajo cluster instance should have one Tajo rootdir. You can specify the Tajo rootdir as follows:</p>
+
+<div class="source">
+<pre>  &lt;property&gt;
+    &lt;name&gt;tajo.rootdir&lt;/name&gt;
+    &lt;value&gt;hdfs://namenode_hostname:port/path&lt;/value&gt;
+  &lt;/property&gt;
+</pre></div>
+<p>The Tajo rootdir must be a URL of the form <tt>scheme://hostname:port/path</tt>. The current implementation only supports the <tt>hdfs://</tt> and <tt>file://</tt> schemes. The default value is <tt>file:///tmp/tajo-${user.name}/</tt>.</p></div>
+<div class="section">
+<h4><a name="TajoMasterHeap"></a>TajoMaster Heap Memory Size<a name="TajoMaster_Heap_Memory_Size"></a></h4>
+<p>The environment variable TAJO_MASTER_HEAPSIZE in conf/tajo-env.sh sets the heap memory size that Tajo Master is allowed to use.</p>
+<p>To adjust the heap memory size, set the TAJO_MASTER_HEAPSIZE variable in conf/tajo-env.sh to a suitable size as follows:</p>
+
+<div class="source">
+<pre>TAJO_MASTER_HEAPSIZE=2000
+</pre></div>
+<p>The default size is 1000 (1GB). </p></div></div></div>
+<div class="section">
+<h2><a name="TajoWorkerConfiguration"></a>Tajo Worker Configuration<a name="Tajo_Worker_Configuration"></a></h2>
+<div class="section">
+<h3><a name="WorkerHeap"></a>Worker Heap Memory Size<a name="Worker_Heap_Memory_Size"></a></h3>
+<p>The environment variable TAJO_WORKER_HEAPSIZE in conf/tajo-env.sh sets the heap memory size that Tajo Worker is allowed to use.</p>
+<p>To adjust the heap memory size, set the TAJO_WORKER_HEAPSIZE variable in conf/tajo-env.sh to a suitable size as follows:</p>
+
+<div class="source">
+<pre>TAJO_WORKER_HEAPSIZE=8000
+</pre></div>
+<p>The default size is 1000 (1GB).</p></div>
+<div class="section">
+<h3><a name="TemporaryDataDir"></a>Temporary Data Directory<a name="Temporary_Data_Directory"></a></h3>
+<p>TajoWorker stores temporary data on the local file system because it uses out-of-core algorithms. You can specify one or more directories where this temporary data will be stored.</p>
+<p><i>tajo-site.xml</i></p>
+
+<div class="source">
+<pre>  &lt;property&gt;
+    &lt;name&gt;tajo.worker.tmpdir.locations&lt;/name&gt;
+    &lt;value&gt;/disk1/tmpdir,/disk2/tmpdir,/disk3/tmpdir&lt;/value&gt;
+  &lt;/property&gt;
+</pre></div></div>
+<div class="section">
+<h3><a name="MaximumParallelRunningTasks"></a>Maximum number of parallel running tasks for each worker<a name="Maximum_number_of_parallel_running_tasks_for_each_worker"></a></h3>
+<p>Each worker can execute multiple tasks at a time. Tajo allows users to specify the maximum number of parallel running tasks for each worker.</p>
+<p><i>tajo-site.xml</i></p>
+
+<div class="source">
+<pre>  &lt;property&gt;
+    &lt;name&gt;tajo.worker.parallel-execution.max-num&lt;/name&gt;
+    &lt;value&gt;12&lt;/value&gt;
+  &lt;/property&gt;
+</pre></div></div></div>
+<div class="section">
+<h2><a name="CatalogConfiguration"></a>Catalog Configuration<a name="Catalog_Configuration"></a></h2>
+<p>If you want to customize the catalog service, copy $TAJO_HOME/conf/catalog-site.xml.templete to catalog-site.xml. Then, add the following configs to catalog-site.xml. Note that the default configs are enough to launch a Tajo cluster in most cases.</p>
+
+<ul>
+  
+<li>tajo.catalog.master.addr - If you want to launch a Tajo cluster in distributed mode, you must specify this address. For more detailed information, see <a href="#DefaultPorts">Default Ports</a>.</li>
+  
+<li>tajo.catalog.store.class - If you want to change the persistent storage of the catalog server, specify the class name. Its default value is tajo.catalog.store.DerbyStore. In the current version, Tajo provides three persistent storage classes as follows:
+  
+<ul>
+    
+<li>tajo.catalog.store.DerbyStore - this storage class uses Apache Derby.</li>
+    
+<li>tajo.catalog.store.MySQLStore - this storage class uses MySQL.</li>
+    
+<li>tajo.catalog.store.MemStore - this is the in-memory storage. It is used only in unit tests, to shorten their running time.</li>
+  </ul></li>
+</ul></div>
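+<p>As an illustration, a catalog-site.xml entry that switches the catalog store from the default Derby store to MySQL might look like the sketch below. Only the property name and class name come from the list above; a MySQL-backed store will presumably also need connection settings (for example a JDBC URL and credentials), which this section does not describe and which are therefore left out.</p>
+
+<div class="source">
+<pre>  &lt;!-- Sketch only: connection settings for MySQL are not shown --&gt;
+  &lt;property&gt;
+    &lt;name&gt;tajo.catalog.store.class&lt;/name&gt;
+    &lt;value&gt;tajo.catalog.store.MySQLStore&lt;/value&gt;
+  &lt;/property&gt;
+</pre></div>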
+<div class="section">
+<h2><a name="DefaultPorts"></a>RPC/Http Service Configuration and Default Addresses<a name="RPCHttp_Service_Configuration_and_Default_Addresses"></a></h2>
+<div class="section">
+<h3><a name="TajoMasterDefaultPorts"></a>Tajo Master<a name="Tajo_Master"></a></h3>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>Service Name </th>
+      
+<th>Config Property Name </th>
+      
+<th>Description </th>
+      
+<th>Default Address </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>Tajo Master Umbilical Rpc </td>
+      
+<td>tajo.master.umbilical-rpc.address </td>
+      
+<td> </td>
+      
+<td>localhost:26001 </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>Tajo Master Client Rpc </td>
+      
+<td>tajo.master.client-rpc.address </td>
+      
+<td> </td>
+      
+<td>localhost:26002 </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>Tajo Master Info Http </td>
+      
+<td>tajo.master.info-http.address </td>
+      
+<td> </td>
+      
+<td>0.0.0.0:26080 </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>Tajo Catalog Client Rpc </td>
+      
+<td>tajo.catalog.client-rpc.address </td>
+      
+<td> </td>
+      
+<td>localhost:26005 </td>
+    </tr>
+  </tbody>
+</table></div>
+<div class="section">
+<h3><a name="TajoWorkerDefaultPorts"></a>Worker<a name="Worker"></a></h3>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>Service Name </th>
+      
+<th>Config Property Name </th>
+      
+<th>Description </th>
+      
+<th>default address </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>Tajo Worker Peer Rpc </td>
+      
+<td>tajo.worker.peer-rpc.address </td>
+      
+<td> </td>
+      
+<td>0.0.0.0:28091 </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>Tajo Worker Client Rpc </td>
+      
+<td>tajo.worker.client-rpc.address </td>
+      
+<td> </td>
+      
+<td>0.0.0.0:28092 </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>Tajo Worker Info Http </td>
+      
+<td>tajo.worker.info-http.address </td>
+      
+<td> </td>
+      
+<td>0.0.0.0:28080 </td>
+    </tr>
+  </tbody>
+</table>
+<h1><a name="CommandLineInterface"></a>Command Line Interface (tsql)</h1>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>bin/tsql [options]
+</pre></div>
+<p>Options</p>
+
+<ul>
+  
+<li>
+<p><tt>-c &quot;quoted sql&quot;</tt> : Execute the quoted SQL statements, and then the shell will exit.</p></li>
+  
+<li>
+<p><tt>-f filename (--file filename)</tt> : Use the file named filename as the source of commands instead of interactive shell.</p></li>
+  
+<li>
+<p><tt>-h hostname (--host hostname)</tt> : Specifies the host name of the machine on which the Tajo master is running.</p></li>
+  
+<li>
+<p><tt>-p port (--port port)</tt> : Specifies the TCP port. If it is not set, the port defaults to 26002.</p></li>
+</ul></div></div>
+<div class="section">
+<h2><a name="EnteringTsql"></a>Entering tsql shell<a name="Entering_tsql_shell"></a></h2>
+<p>If the hostname and the port number are not given, tsql will try to connect to the Tajo master specified in ${TAJO_HOME}/conf/tajo-site.xml.</p>
+
+<div class="source">
+<pre>bin/tsql
+
+tajo&gt;
+</pre></div>
+<p>If you want to connect to a specific TajoMaster, use the &#x2018;-h&#x2019; and/or &#x2018;-p&#x2019; options as follows:</p>
+
+<div class="source">
+<pre>bin/tsql -h localhost -p 9004
+
+tajo&gt; 
+</pre></div></div>
+<div class="section">
+<h2><a name="MetaCommands"></a>Meta Commands<a name="Meta_Commands"></a></h2>
+<p>In tsql, any command that begins with an unquoted backslash (&#x2018;\&#x2019;) is a tsql meta-command that is processed by tsql itself.</p>
+<p>The current implementation provides the following meta-commands:</p>
+
+<div class="source">
+<pre>tajo&gt; \?
+
+General
+  \copyright  show Apache License 2.0
+  \version    show Tajo version
+  \?          show help
+  \q          quit tsql
+
+
+Informational
+  \d         list tables
+  \d  NAME   describe table
+
+
+Documentations
+  tsql guide        http://wiki.apache.org/tajo/tsql
+  Query language    http://wiki.apache.org/tajo/QueryLanguage
+  Functions         http://wiki.apache.org/tajo/Functions
+  Backup &amp; restore  http://wiki.apache.org/tajo/BackupAndRestore
+  Configuration     http://wiki.apache.org/tajo/Configuration
+</pre></div></div>
+<div class="section">
+<h2><a name="CLI_Examples"></a>Examples<a name="Examples"></a></h2>
+<p>If you want to list all table names, use the &#x2018;\d&#x2019; meta-command as follows:</p>
+
+<div class="source">
+<pre>tajo&gt; \d
+customer
+lineitem
+nation
+orders
+part
+partsupp
+region
+supplier
+</pre></div>
+<p>Now look at the table description:</p>
+
+<div class="source">
+<pre>tajo&gt; \d orders
+
+table name: orders
+table path: hdfs:/xxx/xxx/tpch/orders
+store type: CSV
+number of rows: 0
+volume (bytes): 172.0 MB
+schema: 
+o_orderkey      INT8
+o_custkey       INT8
+o_orderstatus   TEXT
+o_totalprice    FLOAT8
+o_orderdate     TEXT
+o_orderpriority TEXT
+o_clerk TEXT
+o_shippriority  INT4
+o_comment       TEXT
+</pre></div>
+<h1><a name="DataModel"></a>Data Model</h1></div>
+<div class="section">
+<h2><a name="DataTypes"></a>Data Types<a name="Data_Types"></a></h2>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>Supported </th>
+      
+<th>SQL Type Name </th>
+      
+<th>Alias </th>
+      
+<th>Size (byte) </th>
+      
+<th>Description </th>
+      
+<th>Range </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>O </td>
+      
+<td>boolean </td>
+      
+<td>bool </td>
+      
+<td>1 </td>
+      
+<td> </td>
+      
+<td>true/false </td>
+    </tr>
+    
+<tr class="a">
+      
+<td> </td>
+      
+<td>bit </td>
+      
+<td> </td>
+      
+<td>1 </td>
+      
+<td> </td>
+      
+<td>1/0 </td>
+    </tr>
+    
+<tr class="b">
+      
+<td> </td>
+      
+<td>varbit </td>
+      
+<td>bit varying </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>O </td>
+      
+<td>smallint </td>
+      
+<td>tinyint, int2 </td>
+      
+<td>2 </td>
+      
+<td>small-range integer value </td>
+      
+<td>-2^15 (-32,768) to 2^15 - 1 (32,767) </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>O </td>
+      
+<td>integer </td>
+      
+<td>int, int4 </td>
+      
+<td>4 </td>
+      
+<td>integer value </td>
+      
+<td>-2^31 (-2,147,483,648) to 2^31 - 1 (2,147,483,647) </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>O </td>
+      
+<td>bigint </td>
+      
+<td>int8 </td>
+      
+<td>8 </td>
+      
+<td>larger range integer value </td>
+      
+<td>-2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807) </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>O </td>
+      
+<td>real </td>
+      
+<td>float4 </td>
+      
+<td>4 </td>
+      
+<td>variable-precision, inexact, real number value </td>
+      
+<td>-3.4028235E+38 to 3.4028235E+38 (6 decimal digits precision) </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>O </td>
+      
+<td>float[(n)] </td>
+      
+<td> </td>
+      
+<td>4 or 8 </td>
+      
+<td>variable-precision, inexact, real number value </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>O </td>
+      
+<td>double </td>
+      
+<td>float8, double precision </td>
+      
+<td>8 </td>
+      
+<td>variable-precision, inexact, real number value </td>
+      
+<td>1.7E-308 to 1.7E+308 (15 decimal digits precision) </td>
+    </tr>
+    
+<tr class="a">
+      
+<td> </td>
+      
+<td>number </td>
+      
+<td>decimal </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td> </td>
+      
+<td>char[(n)] </td>
+      
+<td>character </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td> </td>
+      
+<td>varchar[(n)] </td>
+      
+<td>character varying </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>O </td>
+      
+<td>text </td>
+      
+<td>text </td>
+      
+<td> </td>
+      
+<td>variable-length unicode text </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td> </td>
+      
+<td>binary </td>
+      
+<td>binary </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td> </td>
+      
+<td>varbinary[(n)]</td>
+      
+<td>binary varying </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>O </td>
+      
+<td>blob </td>
+      
+<td>bytea </td>
+      
+<td> </td>
+      
+<td>variable-length binary string </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td> </td>
+      
+<td>date </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td> </td>
+      
+<td>time </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td> </td>
+      
+<td>timetz </td>
+      
+<td>time with time zone </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td> </td>
+      
+<td>timestamp </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td> </td>
+      
+<td>timestamptz </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>O </td>
+      
+<td>inet4 </td>
+      
+<td> </td>
+      
+<td>4 </td>
+      
+<td>IPv4 address </td>
+      
+<td> </td>
+    </tr>
+  </tbody>
+</table>
+<div class="section">
+<h3><a name="UsingRealNumberValue"></a>Using real number value (real and double)<a name="Using_real_number_value_real_and_double"></a></h3>
+<p>The real and double data types are mapped to the Java primitives float and double, respectively. The Java float and double primitives follow the IEEE 754 specification, so these types match the corresponding SQL standard data types.</p>
+
+<ul>
+  
+<li>float[( n )] is mapped to either float or double according to a given length n. If n is specified, it must be between 1 and 53. The default value of n is 53.</li>
+  
+<li>If 1 &lt;= n &lt;= 24, a value is mapped to float (6 decimal digits precision).</li>
+  
+<li>If 25 &lt;= n &lt;= 53, a value is mapped to double (15 decimal digits precision).</li>
+  
+<li>
+<p>Do not use approximate real number columns in a WHERE clause for exact-match comparisons, especially with the = and &lt;&gt; operators. The &gt; and &lt; comparisons work well.</p></li>
+</ul>
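+<p>As a minimal sketch of the last point, the queries below contrast an exact-match comparison with a range comparison on an approximate column. The lineitem table and its l_quantity column are borrowed from the TPC-H examples elsewhere in this document; the assumption that l_quantity is stored as a real or double column, and the literal values, are illustrative only.</p>
+
+<div class="source">
+<pre>-- fragile: an exact match on an approximate value may miss rows
+SELECT l_orderkey FROM lineitem WHERE l_quantity = 0.1;
+
+-- more robust: compare against a small range around the target value
+SELECT l_orderkey FROM lineitem WHERE l_quantity &gt; 0.09 AND l_quantity &lt; 0.11;
+</pre></div>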
+<h1><a name="SQLLanguage"></a>The SQL Language</h1></div></div>
+<div class="section">
+<h2><a name="DDL"></a>Data Definition Language<a name="Data_Definition_Language"></a></h2>
+<div class="section">
+<h3><a name="CreateTable"></a>CREATE TABLE<a name="CREATE_TABLE"></a></h3>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>CREATE TABLE &lt;table_name&gt; [(&lt;column_name&gt; &lt;data_type&gt;, ... )]
+  [using &lt;storage_type&gt; [with (&lt;key&gt; = &lt;value&gt;, ...)]] [AS &lt;select_statement&gt;]
+
+CREATE EXTERNAL TABLE
+
+CREATE EXTERNAL TABLE &lt;table_name&gt; (&lt;column_name&gt; &lt;data_type&gt;, ... )
+  using &lt;storage_type&gt; [with (&lt;key&gt; = &lt;value&gt;, ...)] LOCATION '&lt;path&gt;'
+</pre></div>
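+<p>The following is a minimal sketch of both forms in the synopsis above. The table names, column names, and path are hypothetical, and &#x2018;csvfile.delimiter&#x2019; is the only storage option assumed here; adjust them to your environment.</p>
+
+<div class="source">
+<pre>-- a table stored as CSV (hypothetical schema)
+CREATE TABLE table1 (col1 int8, col2 text) USING csv WITH ('csvfile.delimiter'='|');
+
+-- an external table over existing files (hypothetical path)
+CREATE EXTERNAL TABLE table2 (col1 int8, col2 text)
+  USING csv WITH ('csvfile.delimiter'='|') LOCATION 'hdfs://localhost:9010/path/to/table2';
+</pre></div>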
+<div class="section">
+<h4><a name="DDLCompression"></a>Compression<a name="Compression"></a></h4>
+<p>If you want to add an external table that contains compressed data, you should give the &#x2018;compression.codec&#x2019; parameter to the CREATE TABLE statement.</p>
+
+<div class="source">
+<pre>create EXTERNAL table lineitem (
+  L_ORDERKEY bigint, 
+  L_PARTKEY bigint, 
+  ...
+  L_COMMENT text) 
+
+USING csv WITH ('csvfile.delimiter'='|','compression.codec'='org.apache.hadoop.io.compress.DeflateCodec')
+LOCATION 'hdfs://localhost:9010/tajo/warehouse/lineitem_100_snappy';
+</pre></div>
+<p>The &#x2018;compression.codec&#x2019; parameter can have one of the following compression codecs:</p>
+
+<ul>
+  
+<li>org.apache.hadoop.io.compress.BZip2Codec</li>
+  
+<li>org.apache.hadoop.io.compress.DeflateCodec</li>
+  
+<li>org.apache.hadoop.io.compress.GzipCodec</li>
+  
+<li>org.apache.hadoop.io.compress.SnappyCodec</li>
+</ul></div></div>
+<div class="section">
+<h3><a name="DropTable"></a>DROP TABLE<a name="DROP_TABLE"></a></h3>
+
+<div class="source">
+<pre>DROP TABLE &lt;table_name&gt;
+</pre></div></div></div>
+<div class="section">
+<h2><a name="DML"></a>Data Manipulation Language (DML)<a name="Data_Manipulation_Language_DML"></a></h2>
+<div class="section">
+<h3><a name="SQLExpressions"></a>SQL Expressions<a name="SQL_Expressions"></a></h3>
+<div class="section">
+<h4><a name="ArithmeticExpressions"></a>Arithmetic Expressions<a name="Arithmetic_Expressions"></a></h4>
+<div class="section">
+<h5><a name="TypeCasts"></a>Type Casts<a name="Type_Casts"></a></h5>
+<p>A type cast converts a value of one data type to another data type. Tajo supports two type cast syntaxes:</p>
+
+<div class="source">
+<pre>CAST ( expression AS type )
+expression::type
+</pre></div></div>
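+<p>As a sketch of the two type cast forms, the statements below apply each one to the same column. The orders table and its o_totalprice column come from the tsql example earlier in this document; which source and target type pairs are actually convertible is an assumption here and may depend on the Tajo version.</p>
+
+<div class="source">
+<pre>-- both forms are intended to convert the FLOAT8 column to INT8
+SELECT CAST(o_totalprice AS int8) FROM orders;
+SELECT o_totalprice::int8 FROM orders;
+</pre></div>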
+<div class="section">
+<h5><a name="StringExpressions"></a>String Expressions<a name="String_Expressions"></a></h5>
+<p>(TODO)</p></div>
+<div class="section">
+<h5><a name="FunctionCall"></a>Function Call<a name="Function_Call"></a></h5>
+
+<div class="source">
+<pre>function_name ([expression [, expression ... ]] )
+</pre></div></div></div></div>
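+<p>For instance, a function call can take a single expression or several, and an argument may itself be a function call. The sketch below assumes the customer table from the catalog backup example and the string functions listed under Functions; the &#x2018;-&#x2019; delimiter is illustrative only.</p>
+
+<div class="source">
+<pre>-- a single argument
+SELECT upper(c_name) FROM customer;
+
+-- several arguments, one of which is itself a function call
+SELECT split_part(lower(c_phone), '-', 1) FROM customer;
+</pre></div>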
+<div class="section">
+<h3><a name="Select"></a>SELECT<a name="SELECT"></a></h3>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>SELECT [distinct [all]] * | &lt;expression&gt; [[AS] &lt;alias&gt;] [, ...]
+  [FROM &lt;table name&gt; [[AS] &lt;table alias name&gt;] [, ...]]
+  [WHERE &lt;condition&gt;]
+  [GROUP BY &lt;expression&gt; [, ...]]
+  [HAVING &lt;condition&gt;]
+  [ORDER BY &lt;expression&gt; [ASC|DESC] [NULL FIRST|NULL LAST] [, ...]]
+</pre></div></div>
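+<p>As a sketch, the following query exercises most clauses of the SELECT synopsis. It uses the orders table from the tsql example and the aggregate functions listed under Functions; the threshold values are arbitrary.</p>
+
+<div class="source">
+<pre>SELECT o_orderstatus, count(*) AS cnt, sum(o_totalprice) AS total
+FROM orders
+WHERE o_totalprice &gt; 1000
+GROUP BY o_orderstatus
+HAVING count(*) &gt; 10
+ORDER BY o_orderstatus ASC;
+</pre></div>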
+<div class="section">
+<h3><a name="Where"></a>WHERE<a name="WHERE"></a></h3>
+<div class="section">
+<h4><a name="InPredicate"></a>IN Predicate<a name="IN_Predicate"></a></h4>
+<p>IN predicate provides row and array comparison.</p>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>column_reference IN (val1, val2, ..., valN)
+column_reference NOT IN (val1, val2, ..., valN)
+</pre></div>
+<p>Examples are as follows:</p>
+
+<div class="source">
+<pre>-- this statement lists all the records where the col1 value is 1, 2 or 3:
+SELECT col1, col2 FROM table1 WHERE col1 IN (1, 2, 3);
+
+-- this statement lists all the records where the col1 value is neither 1, 2 nor 3:
+SELECT col1, col2 FROM table1 WHERE col1 NOT IN (1, 2, 3);
+</pre></div>
+<p>You can also use the IN predicate on text data as follows:</p>
+
+<div class="source">
+<pre>SELECT col1, col2 FROM table1 WHERE col2 IN ('tajo', 'hadoop');
+
+SELECT col1, col2 FROM table1 WHERE col2 NOT IN ('tajo', 'hadoop');
+</pre></div></div>
+<div class="section">
+<h4><a name="StringPatternMatching"></a>String Pattern Matching Predicates<a name="String_Pattern_Matching_Predicates"></a></h4>
+<div class="section">
+<h5><a name="LikePredicate"></a>LIKE<a name="LIKE"></a></h5>
+<p>The LIKE operator returns true or false depending on whether its pattern matches the given string. An underscore (_) in the pattern matches any single character. A percent sign (%) matches any sequence of zero or more characters.</p>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>string LIKE pattern
+string NOT LIKE pattern
+</pre></div></div>
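+<p>Assuming LIKE follows the usual SQL semantics (the pattern must match the whole string, and matching is case-sensitive), a few illustrative evaluations are:</p>
+
+<div class="source">
+<pre>'tajo' LIKE 'ta%'     true
+'tajo' LIKE 't_jo'    true
+'tajo' LIKE 'ta'      false
+'Tajo' LIKE 'ta%'     false
+</pre></div>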
+<div class="section">
+<h5><a name="ILikePredicate"></a>ILIKE<a name="ILIKE"></a></h5>
+<p>ILIKE is the same as LIKE, but it is case-insensitive. It is not in the SQL standard; Tajo borrows this operator from PostgreSQL.</p>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>string ILIKE pattern
+string NOT ILIKE pattern
+</pre></div></div>
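+<p>Assuming ILIKE requires the pattern to match the whole string, as LIKE does in standard SQL, but ignores letter case, a few illustrative evaluations are:</p>
+
+<div class="source">
+<pre>'Tajo' ILIKE 'ta%'    true
+'TAJO' ILIKE 't_jo'   true
+'Tajo' ILIKE 'ta'     false
+</pre></div>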
+<div class="section">
+<h5><a name="SimilarToPredicate"></a>SIMILAR TO<a name="SIMILAR_TO"></a></h5>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>string SIMILAR TO pattern
+string NOT SIMILAR TO pattern
+</pre></div>
+<p>It returns true or false depending on whether its pattern matches the given string. Like LIKE, &#x2018;SIMILAR TO&#x2019; uses &#x2018;_&#x2019; and &#x2018;%&#x2019; as metacharacters denoting any single character and any string, respectively.</p>
+<p>In addition to these metacharacters borrowed from LIKE, &#x2018;SIMILAR TO&#x2019; supports more powerful pattern-matching metacharacters borrowed from regular expressions:</p>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>metacharacter </th>
+      
+<th>description </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>| </td>
+      
+<td>denotes alternation (either of two alternatives). </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>* </td>
+      
+<td>denotes repetition of the previous item zero or more times. </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>+ </td>
+      
+<td>denotes repetition of the previous item one or more times. </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>? </td>
+      
+<td>denotes repetition of the previous item zero or one time. </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>{m} </td>
+      
+<td>denotes repetition of the previous item exactly m times. </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>{m,} </td>
+      
+<td>denotes repetition of the previous item m or more times. </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>{m,n} </td>
+      
+<td>denotes repetition of the previous item at least m and not more than n times. </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>[] </td>
+      
+<td>A bracket expression specifies a character class, just as in POSIX regular expressions. </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>() </td>
+      
+<td>Parentheses can be used to group items into a single logical item. </td>
+    </tr>
+  </tbody>
+</table>
+<p>Note that &#x2018;.&#x2019; is not used as a metacharacter in &#x2018;SIMILAR TO&#x2019; operator.</p></div>
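+<p>Assuming &#x2018;SIMILAR TO&#x2019; follows the SQL standard semantics, in which the pattern must match the entire string, some illustrative evaluations are:</p>
+
+<div class="source">
+<pre>'abc'  SIMILAR TO 'abc'        true
+'abc'  SIMILAR TO 'a'          false
+'abc'  SIMILAR TO '%(b|d)%'    true
+'abc'  SIMILAR TO '(b|c)%'     false
+'aaab' SIMILAR TO 'a{3}b'      true
+</pre></div>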
+<div class="section">
+<h5><a name="RegularExpressions"></a>Regular expressions<a name="Regular_expressions"></a></h5>
+<p>Regular expressions provide a very powerful means for string pattern matching. In the current version of Tajo, regular expressions are based on Java-style regular expressions instead of POSIX regular expressions. The main difference between the Java style and the POSIX style is in character classes.</p>
+<p><i>Synopsis</i></p>
+
+<div class="source">
+<pre>string ~ pattern
+string !~ pattern
+
+string ~* pattern
+string !~* pattern
+</pre></div>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>operator </th>
+      
+<th>Description </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>~ </td>
+      
+<td>It returns true if a given regular expression is matched to string. Otherwise, it returns false. </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>!~ </td>
+      
+<td>It returns false if a given regular expression is matched to string. Otherwise, it returns true. </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>~* </td>
+      
+<td>It is the same as &#x2018;~&#x2019;, but it is case-insensitive. </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>!~* </td>
+      
+<td>It is the same as &#x2018;!~&#x2019;, but it is case-insensitive. </td>
+    </tr>
+  </tbody>
+</table>
+<p>Here are examples:</p>
+
+<div class="source">
+<pre>'abc'   ~   '.*c'               true
+'abc'   ~   'c'                 false
+'aaabc' ~   '([a-z]){3}bc'      true
+'abc'   ~*  '.*C'               true
+'abc'   !~* 'B.*'               true
+</pre></div>
+<p>The regular expression operators are not in the SQL standard. Tajo borrows these operators from PostgreSQL.</p>
+<p><i>Synopsis for REGEXP and RLIKE operators</i></p>
+
+<div class="source">
+<pre>string REGEXP pattern
+string NOT REGEXP pattern
+
+string RLIKE pattern
+string NOT RLIKE pattern
+</pre></div>
+<p>Note that REGEXP and RLIKE do not support case-insensitive matching.</p></div></div></div>
+<div class="section">
+<h3><a name="InsertOverwrite"></a>INSERT (OVERWRITE) INTO<a name="INSERT_OVERWRITE_INTO"></a></h3>
+<p>The INSERT OVERWRITE statement overwrites the data of an existing table or the data in a given directory. Tajo&#x2019;s INSERT OVERWRITE statement follows the &#x2018;INSERT INTO SELECT&#x2019; statement of SQL. The examples are as follows:</p>
+
+<div class="source">
+<pre>create table t1 (col1 int8, col2 int4, col3 float4);
+
+-- when a target table schema and output schema are equivalent to each other
+INSERT OVERWRITE INTO t1 SELECT l_orderkey, l_partkey, l_quantity FROM lineitem;
+-- or
+INSERT OVERWRITE INTO t1 SELECT * FROM lineitem;
+
+-- when the output schema is smaller than the target table schema
+INSERT OVERWRITE INTO t1 SELECT l_orderkey FROM lineitem;
+
+-- when you want to specify certain target columns
+INSERT OVERWRITE INTO t1 (col1, col3) SELECT l_orderkey, l_quantity FROM lineitem;
+</pre></div>
+<p>In addition to table data, the INSERT OVERWRITE statement can also overwrite the data in a specific directory.</p>
+
+<div class="source">
+<pre>INSERT OVERWRITE INTO LOCATION '/dir/subdir' SELECT l_orderkey, l_quantity FROM lineitem;
+</pre></div></div></div>
+<div class="section">
+<h2><a name="Functions"></a>Functions</h2>
+<div class="section">
+<h3><a name="StandardFunctions"></a>Standard Functions<a name="Standard_Functions"></a></h3>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>function definition </th>
+      
+<th>return type </th>
+      
+<th>description </th>
+      
+<th>example </th>
+      
+<th>result </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>count(*) </td>
+      
+<td>int8 </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>count(expr) </td>
+      
+<td>int8 </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>avg(expr) </td>
+      
+<td>depending on expr </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>sum(expr) </td>
+      
+<td>depending on expr </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>min(expr) </td>
+      
+<td>depending on expr </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>max(expr) </td>
+      
+<td>depending on expr </td>
+      
+<td> </td>
+      
+<td> </td>
+      
+<td> </td>
+    </tr>
+  </tbody>
+</table></div>
+<div class="section">
+<h3><a name="StringFunctions"></a>String Operator and Functions<a name="String_Operator_and_Functions"></a></h3>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>function definition </th>
+      
+<th>return type </th>
+      
+<th>description </th>
+      
+<th>example </th>
+      
+<th>result </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>string || string </td>
+      
+<td>text </td>
+      
+<td>String concatenation </td>
+      
+<td>'Ta' || 'jo' </td>
+      
+<td>Tajo </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>char_length(string text) or character_length(string text) </td>
+      
+<td>int </td>
+      
+<td>Number of characters in string </td>
+      
+<td>char_length('Tajo') </td>
+      
+<td>4 </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>trim([leading | trailing | both] [characters] from string)</td>
+      
+<td>text </td>
+      
+<td>Remove the characters (a space by default) from the start/end/both ends of the string</td>
+      
+<td>trim(both 'x' from 'xTajoxx') </td>
+      
+<td>Tajo </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>btrim(string text [, characters text]) </td>
+      
+<td>text </td>
+      
+<td>Remove the characters (a space by default) from both ends of the string </td>
+      
+<td>btrim('xTajoxx', 'x') </td>
+      
+<td>Tajo </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>ltrim(string text [, characters text]) </td>
+      
+<td>text </td>
+      
+<td>Remove the characters (a space by default) from the start of the string </td>
+      
+<td>ltrim('xxTajo', 'x') </td>
+      
+<td>Tajo </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>rtrim(string text [, characters text]) </td>
+      
+<td>text </td>
+      
+<td>Remove the characters (a space by default) from the end of the string </td>
+      
+<td>rtrim('Tajoxx', 'x') </td>
+      
+<td>Tajo </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>split_part(string text, delimiter text, field int) </td>
+      
+<td>text </td>
+      
+<td>Split a string on delimiter and return the given field (counting from one) </td>
+      
+<td>split_part('ab_bc_cd','_',2) </td>
+      
+<td>bc </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>regexp_replace(string text, pattern text, replacement text) </td>
+      
+<td>text </td>
+      
+<td>Replace substrings matched to a given regular expression pattern </td>
+      
+<td>regexp_replace('abcdef', '(^ab|ef$)', '--') </td>
+      
+<td>--cd-- </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>upper(string text) </td>
+      
+<td>text </td>
+      
+<td>Convert the input text to upper case </td>
+      
+<td>upper('tajo') </td>
+      
+<td>TAJO </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>lower(string text) </td>
+      
+<td>text </td>
+      
+<td>Convert the input text to lower case </td>
+      
+<td>lower('TAJO') </td>
+      
+<td>tajo </td>
+    </tr>
+  </tbody>
+</table>
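+<p>As a sketch, several of these operators and functions can be combined in one query. The customer table and its columns come from the catalog backup example in this document; the literal prefix is illustrative only.</p>
+
+<div class="source">
+<pre>SELECT 'Segment: ' || c_mktsegment,
+       char_length(c_comment),
+       btrim(c_address)
+FROM customer;
+</pre></div>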
+<h1><a name="Administration"></a>Administration</h1></div></div>
+<div class="section">
+<h2><a name="CatalogBackup"></a>Catalog Backup and Restore<a name="Catalog_Backup_and_Restore"></a></h2>
+<p>Tajo currently supports two backup methods for the catalog:</p>
+
+<ul>
+  
+<li>SQL dump</li>
+  
+<li>Database-level backup</li>
+</ul>
+<div class="section">
+<h3><a name="SQLDump"></a>SQL dump<a name="SQL_dump"></a></h3>
+<p>SQL dump is an easy and robust approach. If you use it, you don&#x2019;t need to be concerned about database-level compatibility. If you want to back up your catalog, just use the bin/tajo_dump command. The basic usage of this command is:</p>
+
+<div class="source">
+<pre>$ tajo_dump table_name &gt; outfile
+</pre></div>
+<p>For example, if you want to back up the table customer, type the following command:</p>
+
+<div class="source">
+<pre>$ bin/tajo_dump customer &gt; table_backup.sql
+$
+$ cat table_backup.sql
+-- Tajo database dump
+-- Dump date: 10/04/2013 16:28:03
+--
+
+--
+-- Name: customer; Type: TABLE; Storage: CSV
+-- Path: file:/home/hyunsik/tpch/customer
+--
+CREATE EXTERNAL TABLE customer (c_custkey INT8, c_name TEXT, c_address TEXT, c_nationkey INT8, c_phone TEXT, c_acctbal FLOAT8, c_mktsegment TEXT, c_comment TEXT) USING CSV LOCATION 'file:/home/hyunsik/tpch/customer';
+</pre></div>
+<p>If you want to restore the catalog from the SQL dump file, type the following command:</p>
+
+<div class="source">
+<pre>$ bin/tsql -f table_backup.sql
+</pre></div>
+<p>If you use the &#x2018;-a&#x2019; option, tajo_dump will dump the DDLs of all tables.</p>
+
+<div class="source">
+<pre>$ bin/tajo_dump -a &gt; all_backup.sql
+</pre></div></div>
+<div class="section">
+<h3><a name="DatabaseLevelBackup"></a>Database-level backup<a name="Database-level_backup"></a></h3>
+<p>(TODO)</p></div></div>
+                  </div>
+            </div>
+          </div>
+
+    <hr/>
+
+    <footer>
+            <div class="container-fluid">
+              <div class="row span12">Copyright &copy;                    2013
+                        <a href="http://www.apache.org">Apache Software Foundation</a>.
+            All Rights Reserved.      
+                    
+      </div>
+
+<div class="row span12">Apache Tajo, Apache Hadoop, Apache, the Apache feather logo, and the Apache incubator logo are
+        trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks
+        or registered trademarks of their respective owners.</div>
+                  
+        
+                </div>
+    </footer>
+  </body>
+</html>