Posted to commits@tajo.apache.org by hy...@apache.org on 2014/03/07 04:14:27 UTC

git commit: TAJO-659: Add Tajo JDBC documentation.

Repository: incubator-tajo
Updated Branches:
  refs/heads/master 399c600c7 -> e3da0cafd


TAJO-659: Add Tajo JDBC documentation.


Project: http://git-wip-us.apache.org/repos/asf/incubator-tajo/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-tajo/commit/e3da0caf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-tajo/tree/e3da0caf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-tajo/diff/e3da0caf

Branch: refs/heads/master
Commit: e3da0cafd4e94e024210315545c78d86f3ad66bc
Parents: 399c600
Author: Hyunsik Choi <hy...@apache.org>
Authored: Fri Mar 7 12:14:02 2014 +0900
Committer: Hyunsik Choi <hy...@apache.org>
Committed: Fri Mar 7 12:14:02 2014 +0900

----------------------------------------------------------------------
 CHANGES.txt                                     |   2 +
 .../configuration/configuration_defaults.rst    |   1 -
 .../src/main/sphinx/hcatalog_integration.rst    |  36 ++++-
 tajo-docs/src/main/sphinx/jdbc_driver.rst       | 132 ++++++++++++++++++-
 4 files changed, 168 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index b8c22ef..07048dd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -521,6 +521,8 @@ Release 0.8.0 - unreleased
 
   TASKS
 
+    TAJO-659: Add Tajo JDBC documentation. (hyunsik)
+
     TAJO-642: Change tajo documentation tool to sphinx. (hyunsik)
 
     TAJO-632: add intellij idea projects files into git ignore.

http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst b/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
index d1e3add..5fcbe67 100644
--- a/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
+++ b/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
@@ -2,7 +2,6 @@
 Configuration Defaults
 **********************
 
-====================================
 Tajo Master Configuration Defaults
 ====================================
 

http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/tajo-docs/src/main/sphinx/hcatalog_integration.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/hcatalog_integration.rst b/tajo-docs/src/main/sphinx/hcatalog_integration.rst
index 552ff82..7337346 100644
--- a/tajo-docs/src/main/sphinx/hcatalog_integration.rst
+++ b/tajo-docs/src/main/sphinx/hcatalog_integration.rst
@@ -2,4 +2,38 @@
 HCatalog Integration
 *************************************
 
-.. todo::
\ No newline at end of file
+The Apache Tajo™ catalog supports the HCatalogStore driver to integrate with Apache Hive™.
+This integration allows Tajo to access all tables used in Apache Hive.
+Depending on your purpose, you can execute either SQL queries or HiveQL queries on the
+same tables managed in Apache Hive.
+
+In order to use this feature, you need to build Tajo with a specific maven profile
+and then add some configuration to ``conf/tajo-env.sh`` and ``conf/catalog-site.xml``.
+This section describes how to set up HCatalog integration.
+Following these instructions should take no more than ten minutes.
+
+First, you need to compile the source code with an hcatalog profile.
+Currently, Tajo supports the hcatalog-0.11.0 and hcatalog-0.12.0 profiles.
+So, if you want to use Hive 0.11.0, you need to set ``-Phcatalog-0.11.0`` as the maven profile ::
+
+  $ mvn clean package -DskipTests -Pdist -Dtar -Phcatalog-0.11.0
+
+Or, if you want to use Hive 0.12.0, you need to set ``-Phcatalog-0.12.0`` as the maven profile ::
+
+  $ mvn clean package -DskipTests -Pdist -Dtar -Phcatalog-0.12.0
+
+Then, you need to set your Hive home directory to the environment variable ``HIVE_HOME`` in ``conf/tajo-env.sh`` as follows: ::
+
+  export HIVE_HOME=/path/to/your/hive/directory
+
+If you need to use JDBC to connect to the HiveMetaStore, you have to prepare the MySQL JDBC driver.
+In that case, set the path of the MySQL JDBC driver jar file to the environment variable ``HIVE_JDBC_DRIVER_DIR`` in ``conf/tajo-env.sh`` as follows: ::
+
+  export HIVE_JDBC_DRIVER_DIR=/path/to/your/mysql_jdbc_driver/mysql-connector-java-x.x.x-bin.jar
+
+Finally, you should specify HCatalogStore as the Tajo catalog driver class in ``conf/catalog-site.xml`` as follows: ::
+
+  <property>
+    <name>tajo.catalog.store.class</name>
+    <value>org.apache.tajo.catalog.store.HCatalogStore</value>
+  </property>
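+
+As a quick sanity check (a sketch only, assuming Tajo is already running and Hive
+already manages some tables; the ``tsql`` prompt may look slightly different in your
+version), you can start the Tajo shell and list the tables visible through
+HCatalogStore: ::
+
+  $ ${TAJO_HOME}/bin/tsql
+  tajo> \d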

http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/tajo-docs/src/main/sphinx/jdbc_driver.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/jdbc_driver.rst b/tajo-docs/src/main/sphinx/jdbc_driver.rst
index 153fbff..306cdc1 100644
--- a/tajo-docs/src/main/sphinx/jdbc_driver.rst
+++ b/tajo-docs/src/main/sphinx/jdbc_driver.rst
@@ -2,8 +2,138 @@
 Tajo JDBC Driver
 *************************************
 
+Apache Tajo™ provides a JDBC driver,
+which enables Java applications to easily access Apache Tajo in an RDBMS-like manner.
+In this section, we explain how to get the JDBC driver and walk through an example client.
+
 How to get JDBC driver
 =======================
 
+Tajo provides the necessary jar files packaged by maven. In order to get the jar files,
+run the following commands.
+
+.. code-block:: bash
+
+  $ cd tajo-x.y.z-incubating
+  $ mvn clean package -DskipTests -Pdist -Dtar
+  $ ls -l tajo-dist/target/tajo-x.y.z-incubating/share/jdbc-dist
+
+
+Setting the CLASSPATH
+=======================
+
+In order to use the JDBC driver, you should add the jar files included in
+``tajo-dist/target/tajo-x.y.z-incubating/share/jdbc-dist`` to your ``CLASSPATH``.
+In addition, you should add the hadoop classpath to your ``CLASSPATH``.
+So, ``CLASSPATH`` will be set as follows:
+
+.. code-block:: bash
+
+  CLASSPATH=path/to/tajo-jdbc/*:${TAJO_HOME}/conf:$(hadoop classpath)
+
+.. note::
+
+  You can get ``$(hadoop classpath)`` by executing the command ``bin/hadoop classpath`` in your hadoop cluster.
+
+.. note::
+
+  You may want to use a minimal set of JAR files. If so, please refer to :ref:`minimal_jar_files`.
+
+An Example JDBC Client
+=======================
+
+The JDBC driver class name is ``org.apache.tajo.jdbc.TajoDriver``.
+You can load the driver with ``Class.forName("org.apache.tajo.jdbc.TajoDriver").newInstance()``.
+The connection url should be ``jdbc:tajo://<TajoMaster hostname>:<TajoMaster client rpc port>``.
+The default TajoMaster client rpc port is ``26002``.
+If you want to change the listening port, please refer to :doc:`/configuration/configuration_defaults`.
+
+.. note::
+  
+  Currently, Tajo does not support the concepts of database and namespace.
+  All tables are contained in the ``default`` database, so you don't need to specify any database name.
+
+The following shows an example JDBC client.
+
+.. code-block:: java
+
+  import java.sql.Connection;
+  import java.sql.ResultSet;
+  import java.sql.Statement;
+  import java.sql.DriverManager;
+
+  public class TajoJDBCClient {
+    
+    ....
+
+    public static void main(String[] args) throws Exception {
+      Class.forName("org.apache.tajo.jdbc.TajoDriver").newInstance();
+      Connection conn = DriverManager.getConnection("jdbc:tajo://127.0.0.1:26002");
+
+      Statement stmt = null;
+      ResultSet rs = null;
+      try {
+        stmt = conn.createStatement();
+        rs = stmt.executeQuery("select * from table1");
+        while (rs.next()) {
+          System.out.println(rs.getString(1) + "," + rs.getString(3));
+        }
+      } finally {
+        if (rs != null) rs.close();
+        if (stmt != null) stmt.close();
+        if (conn != null) conn.close();
+      }
+    }
+  }
+
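+You can compile and run the example client above roughly as follows. This is only a
+sketch: the jar directory and the file name ``TajoJDBCClient.java`` are assumptions
+based on the example, so adjust them to your environment.
+
+.. code-block:: bash
+
+  # build the classpath as described in the previous section
+  export CLASSPATH="/path/to/tajo-jdbc/*:${TAJO_HOME}/conf:$(hadoop classpath)"
+
+  # compile and run the example client (the JVM expands the '*' wildcard)
+  javac -cp "${CLASSPATH}" TajoJDBCClient.java
+  java -cp "${CLASSPATH}:." TajoJDBCClient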
+
+Appendix
+===========================================
+
+.. _minimal_jar_files:
+
+Minimal JAR file list
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following is the minimal list of JAR files required.
+We've tested the JDBC driver with these JAR files for
+usual SQL queries, but this does not guarantee that they are
+fully tested for all operations, so you may need additional JAR files.
+In addition to the JAR files listed below, please don't forget to include
+``${HADOOP_HOME}/etc/hadoop`` and ``${TAJO_HOME}/conf`` in your ``CLASSPATH``
+(a sketch of such a classpath follows the list).
+
+  * hadoop-annotations-2.2.0.jar
+  * hadoop-auth-2.2.0.jar
+  * hadoop-common-2.2.0.jar
+  * hadoop-hdfs-2.2.0.jar
+  * joda-time-2.3.jar
+  * tajo-catalog-common-0.8.0-SNAPSHOT.jar
+  * tajo-client-0.8.0-SNAPSHOT.jar
+  * tajo-common-0.8.0-SNAPSHOT.jar
+  * tajo-jdbc-0.8.0-SNAPSHOT.jar
+  * tajo-rpc-0.8.0-SNAPSHOT.jar
+  * tajo-storage-0.8.0-SNAPSHOT.jar
+  * log4j-1.2.17.jar
+  * commons-logging-1.1.1.jar
+  * guava-11.0.2.jar
+  * protobuf-java-2.5.0.jar
+  * netty-3.6.2.Final.jar
+  * commons-lang-2.5.jar
+  * commons-configuration-1.6.jar
+  * slf4j-api-1.7.5.jar
+  * slf4j-log4j12-1.7.5.jar
+  * commons-cli-1.2.jar
+  * commons-io-2.1.jar
+
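+The following is a sketch of how such a minimal ``CLASSPATH`` could be assembled; the
+directory ``/path/to/tajo-jdbc-minimal`` is a hypothetical location into which you have
+copied the JAR files listed above.
+
+.. code-block:: bash
+
+  # minimal classpath: the copied JARs plus the Hadoop and Tajo configuration directories
+  export CLASSPATH="/path/to/tajo-jdbc-minimal/*:${HADOOP_HOME}/etc/hadoop:${TAJO_HOME}/conf"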
+
+FAQ
+===========================================
+
+java.nio.channels.UnresolvedAddressException
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When retrieving the final result, the Tajo JDBC driver tries to access HDFS data nodes.
+So, network access between the JDBC client and the HDFS data nodes must be available.
+In many cases, an HDFS cluster is built in a private network which uses private hostnames.
+So, those host names must also be resolvable on the JDBC client side.
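+
+As a quick check (a sketch; the host name below is a placeholder for one of your HDFS
+data nodes), verify from the JDBC client machine that the private host names resolve,
+for example via DNS or an ``/etc/hosts`` entry:
+
+.. code-block:: bash
+
+  # should resolve and print the data node's address; if not, fix name resolution on the client
+  ping -c 1 datanode01.cluster.internal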
 
-.. todo::
\ No newline at end of file