Posted to commits@tajo.apache.org by ji...@apache.org on 2015/01/10 08:27:47 UTC

tajo git commit: TAJO-1294: Add index documents. (jihoon)

Repository: tajo
Updated Branches:
  refs/heads/master 9eac34fe3 -> 3308aab4b


TAJO-1294: Add index documents. (jihoon)


Project: http://git-wip-us.apache.org/repos/asf/tajo/repo
Commit: http://git-wip-us.apache.org/repos/asf/tajo/commit/3308aab4
Tree: http://git-wip-us.apache.org/repos/asf/tajo/tree/3308aab4
Diff: http://git-wip-us.apache.org/repos/asf/tajo/diff/3308aab4

Branch: refs/heads/master
Commit: 3308aab4b64d36b805b0b0e6cc282a6dce97f6ad
Parents: 9eac34f
Author: Jihoon Son <ji...@apache.org>
Authored: Sat Jan 10 16:27:23 2015 +0900
Committer: Jihoon Son <ji...@apache.org>
Committed: Sat Jan 10 16:27:23 2015 +0900

----------------------------------------------------------------------
 CHANGES                                         |  2 +
 tajo-docs/src/main/sphinx/index.rst             |  1 +
 tajo-docs/src/main/sphinx/index/future_work.rst |  8 +++
 tajo-docs/src/main/sphinx/index/how_to_use.rst  | 69 ++++++++++++++++++++
 tajo-docs/src/main/sphinx/index/types.rst       |  7 ++
 tajo-docs/src/main/sphinx/index_overview.rst    | 20 ++++++
 tajo-docs/src/main/sphinx/sql_language/ddl.rst  | 33 +++++++++-
 7 files changed, 139 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tajo/blob/3308aab4/CHANGES
----------------------------------------------------------------------
diff --git a/CHANGES b/CHANGES
index b5578e7..8f32d54 100644
--- a/CHANGES
+++ b/CHANGES
@@ -282,6 +282,8 @@ Release 0.9.1 - unreleased
 
   TASKS
 
+    TAJO-1294: Add index documents. (jihoon)
+
     TAJO-1280: Update the roles of Hyoungjun and Jihun in web site.
     (hyunsik)
 

http://git-wip-us.apache.org/repos/asf/tajo/blob/3308aab4/tajo-docs/src/main/sphinx/index.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/index.rst b/tajo-docs/src/main/sphinx/index.rst
index 80cd842..667f270 100644
--- a/tajo-docs/src/main/sphinx/index.rst
+++ b/tajo-docs/src/main/sphinx/index.rst
@@ -37,6 +37,7 @@ Table of Contents:
    functions
    table_management
    table_partitioning
+   index_overview
    backup_and_restore
    hcatalog_integration
    jdbc_driver   

http://git-wip-us.apache.org/repos/asf/tajo/blob/3308aab4/tajo-docs/src/main/sphinx/index/future_work.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/index/future_work.rst b/tajo-docs/src/main/sphinx/index/future_work.rst
new file mode 100644
index 0000000..c6ec47d
--- /dev/null
+++ b/tajo-docs/src/main/sphinx/index/future_work.rst
@@ -0,0 +1,8 @@
+*************************************
+Future Work
+*************************************
+
+* Providing more index types, such as bitmap and HBase indexes
+* Supporting indexes on partitioned tables
+* Supporting the backup and restore feature for indexes
+* Cost-based query optimization by estimating the query selectivity
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/tajo/blob/3308aab4/tajo-docs/src/main/sphinx/index/how_to_use.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/index/how_to_use.rst b/tajo-docs/src/main/sphinx/index/how_to_use.rst
new file mode 100644
index 0000000..776d205
--- /dev/null
+++ b/tajo-docs/src/main/sphinx/index/how_to_use.rst
@@ -0,0 +1,69 @@
+*************************************
+How to Use an Index
+*************************************
+
+-------------------------------------
+1. Create an index
+-------------------------------------
+
+The first step in using an index is to create it. You can create an index using SQL (:doc:`/sql_language/ddl`) or the Tajo API (:doc:`/tajo_client_api`). For example, you can create a BST index on the ``lineitem`` table by submitting the following SQL statement to Tajo.
+
+.. code-block:: sql
+
+     default> create index l_orderkey_idx on lineitem (l_orderkey);
+
+If the index is created successfully, you can view information about it as follows::
+
+  default> \d lineitem
+
+  table name: default.lineitem
+  table path: hdfs://localhost:7020/tpch/lineitem
+  store type: CSV
+  number of rows: unknown
+  volume: 753.9 MB
+  Options:
+  	'text.delimiter'='|'
+
+  schema:
+  l_orderkey	INT8
+  l_partkey	INT8
+  l_suppkey	INT8
+  l_linenumber	INT8
+  l_quantity	FLOAT4
+  l_extendedprice	FLOAT4
+  l_discount	FLOAT4
+  l_tax	FLOAT4
+  l_returnflag	TEXT
+  l_linestatus	TEXT
+  l_shipdate	DATE
+  l_commitdate	DATE
+  l_receiptdate	DATE
+  l_shipinstruct	TEXT
+  l_shipmode	TEXT
+  l_comment	TEXT
+
+
+  Indexes:
+  "l_orderkey_idx" TWO_LEVEL_BIN_TREE (l_orderkey ASC NULLS LAST )
+
+For more information about index creation, please refer to the documents linked above.
+
+-------------------------------------
+2. Enable/disable index scans
+-------------------------------------
+
+After an index is created successfully, you must enable the index scan feature in order to use it:
+
+.. code-block:: sql
+
+     default> \set INDEX_ENABLED true
+
+If you no longer want to use the index scan feature, you can disable it as follows:
+
+.. code-block:: sql
+
+     default> \set INDEX_ENABLED false
+
+.. note::
+
+     Once the index scan feature is enabled, Tajo currently always performs an index scan, regardless of its efficiency. Enable this option only when the expected number of retrieved tuples is sufficiently small.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/tajo/blob/3308aab4/tajo-docs/src/main/sphinx/index/types.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/index/types.rst b/tajo-docs/src/main/sphinx/index/types.rst
new file mode 100644
index 0000000..457f453
--- /dev/null
+++ b/tajo-docs/src/main/sphinx/index/types.rst
@@ -0,0 +1,7 @@
+*************************************
+Index Types
+*************************************
+
+Currently, Tajo supports only one type of index, ``TWO_LEVEL_BIN_TREE``, or ``BST`` for short. The BST index is a kind of binary search tree that is extended to be stored permanently on disk. It consists of two levels of nodes: a leaf node maps keys to the positions of the corresponding data in an HDFS block, and a root node maps keys to leaf node indices.
+
+When an index scan starts, the query engine first reads the root node and looks up the search key. If it finds a leaf node corresponding to the search key, it then looks up the search key in that leaf node. Finally, it reads the tuple corresponding to the search key directly from HDFS.
\ No newline at end of file
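
The two-step lookup described in index/types.rst above can be sketched in Python. This is only a toy in-memory model under stated assumptions: keys are unique, and the class name `TwoLevelIndex`, its `leaf_size` parameter, and the integer offsets are all hypothetical. It is not Tajo's on-disk format or API, and the final step of reading the tuple from HDFS is left out.

```python
from bisect import bisect_left, bisect_right


class TwoLevelIndex:
    """Toy in-memory two-level index (hypothetical; not Tajo's format)."""

    def __init__(self, entries, leaf_size=4):
        # entries: iterable of (key, offset) pairs; keys are assumed unique.
        ordered = sorted(entries)
        # Leaf level: fixed-size runs of sorted (key, offset) pairs.
        self.leaves = [ordered[i:i + leaf_size]
                       for i in range(0, len(ordered), leaf_size)]
        # Root level: the smallest key of each leaf, in leaf order.
        self.root = [leaf[0][0] for leaf in self.leaves]

    def lookup(self, key):
        """Return the offset stored for key, or None if it is absent."""
        # Step 1: search the root for the last leaf whose first key <= key.
        leaf_no = bisect_right(self.root, key) - 1
        if leaf_no < 0:
            return None
        # Step 2: search that leaf for the key itself.
        leaf = self.leaves[leaf_no]
        keys = [k for k, _ in leaf]
        pos = bisect_left(keys, key)
        if pos < len(keys) and keys[pos] == key:
            return leaf[pos][1]
        return None


# Ten keys with made-up offsets, split into leaves of four entries each.
index = TwoLevelIndex([(k, k * 100) for k in range(1, 11)], leaf_size=4)
print(index.lookup(7))   # 700
print(index.lookup(42))  # None
```

In a real index scan, the returned offset would then be used to seek into the data file and read the matching tuple.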

http://git-wip-us.apache.org/repos/asf/tajo/blob/3308aab4/tajo-docs/src/main/sphinx/index_overview.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/index_overview.rst b/tajo-docs/src/main/sphinx/index_overview.rst
new file mode 100644
index 0000000..a42931b
--- /dev/null
+++ b/tajo-docs/src/main/sphinx/index_overview.rst
@@ -0,0 +1,20 @@
+*****************************
+Index (Experimental Feature)
+*****************************
+
+An index is a data structure that is used for efficient query processing. With an index, the Tajo query engine can retrieve the searched values directly.
+
+This is still an experimental feature. In order to use indexes, you must check out the source code of the ``index_support`` branch::
+
+  git clone -b index_support https://git-wip-us.apache.org/repos/asf/tajo.git tajo-index
+
+For the source code build, please refer to :doc:`getting_started`.
+
+The following sections describe the supported index types, query execution with an index, and future work.
+
+.. toctree::
+      :maxdepth: 1
+
+      index/types
+      index/how_to_use
+      index/future_work
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/tajo/blob/3308aab4/tajo-docs/src/main/sphinx/sql_language/ddl.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/sql_language/ddl.rst b/tajo-docs/src/main/sphinx/sql_language/ddl.rst
index 3fba6be..60b7190 100644
--- a/tajo-docs/src/main/sphinx/sql_language/ddl.rst
+++ b/tajo-docs/src/main/sphinx/sql_language/ddl.rst
@@ -75,4 +75,35 @@ If you want to add an external table that contains compressed data, you should g
 
   DROP TABLE [IF EXISTS] <table_name> [PURGE]
 
-``IF EXISTS`` allows ``DROP DATABASE`` statement to avoid an error which occurs when the database does not exist. ``DROP TABLE`` statement removes a table from Tajo catalog, but it does not remove the contents. If ``PURGE`` option is given, ``DROP TABLE`` statement will eliminate the entry in the catalog as well as the contents.
\ No newline at end of file
+``IF EXISTS`` allows the ``DROP TABLE`` statement to avoid the error that occurs when the table does not exist. The ``DROP TABLE`` statement removes a table from the Tajo catalog, but it does not remove the contents. If the ``PURGE`` option is given, the ``DROP TABLE`` statement removes the catalog entry as well as the contents.
+
+========================
+ CREATE INDEX
+========================
+
+*Synopsis*
+
+.. code-block:: sql
+
+  CREATE INDEX [ name ] ON table_name [ USING method ]
+  ( { column_name | ( expression ) } [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
+  [ WHERE predicate ]
+
+------------------------
+ Index method
+------------------------
+
+Currently, Tajo supports only one type of index.
+
+Index methods:
+  * ``TWO_LEVEL_BIN_TREE``: This method is used by default in Tajo. For more information about its structure, please refer to :doc:`/index/types`.
+
+========================
+ DROP INDEX
+========================
+
+*Synopsis*
+
+.. code-block:: sql
+
+  DROP INDEX name
\ No newline at end of file