Posted to commits@tajo.apache.org by hy...@apache.org on 2014/03/07 04:14:27 UTC
git commit: TAJO-659: Add Tajo JDBC documentation.
Repository: incubator-tajo
Updated Branches:
refs/heads/master 399c600c7 -> e3da0cafd
TAJO-659: Add Tajo JDBC documentation.
Project: http://git-wip-us.apache.org/repos/asf/incubator-tajo/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-tajo/commit/e3da0caf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-tajo/tree/e3da0caf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-tajo/diff/e3da0caf
Branch: refs/heads/master
Commit: e3da0cafd4e94e024210315545c78d86f3ad66bc
Parents: 399c600
Author: Hyunsik Choi <hy...@apache.org>
Authored: Fri Mar 7 12:14:02 2014 +0900
Committer: Hyunsik Choi <hy...@apache.org>
Committed: Fri Mar 7 12:14:02 2014 +0900
----------------------------------------------------------------------
CHANGES.txt | 2 +
.../configuration/configuration_defaults.rst | 1 -
.../src/main/sphinx/hcatalog_integration.rst | 36 ++++-
tajo-docs/src/main/sphinx/jdbc_driver.rst | 132 ++++++++++++++++++-
4 files changed, 168 insertions(+), 3 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index b8c22ef..07048dd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -521,6 +521,8 @@ Release 0.8.0 - unreleased
TASKS
+ TAJO-659: Add Tajo JDBC documentation. (hyunsik)
+
TAJO-642: Change tajo documentation tool to sphinx. (hyunsik)
TAJO-632: add intellij idea projects files into git ignore.
http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst b/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
index d1e3add..5fcbe67 100644
--- a/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
+++ b/tajo-docs/src/main/sphinx/configuration/configuration_defaults.rst
@@ -2,7 +2,6 @@
Configuration Defaults
**********************
-====================================
Tajo Master Configuration Defaults
====================================
http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/tajo-docs/src/main/sphinx/hcatalog_integration.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/hcatalog_integration.rst b/tajo-docs/src/main/sphinx/hcatalog_integration.rst
index 552ff82..7337346 100644
--- a/tajo-docs/src/main/sphinx/hcatalog_integration.rst
+++ b/tajo-docs/src/main/sphinx/hcatalog_integration.rst
@@ -2,4 +2,38 @@
HCatalog Integration
*************************************
-.. todo::
\ No newline at end of file
+The Apache Tajo™ catalog supports the ``HCatalogStore`` driver to integrate with Apache Hive™.
+This integration allows Tajo to access all tables used in Apache Hive.
+Depending on your purpose, you can execute either SQL queries or HiveQL queries on the
+same tables managed in Apache Hive.
+
+In order to use this feature, you need to build Tajo with a specific maven profile
+and then add some configuration to ``conf/tajo-env.sh`` and ``conf/catalog-site.xml``.
+This section describes how to set up HCatalog integration.
+Following these instructions should take no more than ten minutes.
+
+First, you need to compile the source code with the hcatalog profile.
+Currently, Tajo supports the hcatalog-0.11.0 and hcatalog-0.12.0 profiles.
+So, if you want to use Hive 0.11.0, you need to set ``-Phcatalog-0.11.0`` as the maven profile ::
+
+ $ mvn clean package -DskipTests -Pdist -Dtar -Phcatalog-0.11.0
+
+Or, if you want to use Hive 0.12.0, you need to set ``-Phcatalog-0.12.0`` as the maven profile ::
+
+ $ mvn clean package -DskipTests -Pdist -Dtar -Phcatalog-0.12.0
+
+Then, you need to set your Hive home directory to the environment variable ``HIVE_HOME`` in ``conf/tajo-env.sh`` as follows: ::
+
+ export HIVE_HOME=/path/to/your/hive/directory
+
+If you need to use JDBC to connect to the HiveMetaStore, you have to prepare the MySQL JDBC driver.
+Next, you should set the path of the MySQL JDBC driver jar file to the environment variable ``HIVE_JDBC_DRIVER_DIR`` in ``conf/tajo-env.sh`` as follows: ::
+
+ export HIVE_JDBC_DRIVER_DIR=/path/to/your/mysql_jdbc_driver/mysql-connector-java-x.x.x-bin.jar
+
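As a point of reference, a JDBC-backed HiveMetaStore is configured on the Hive side through the standard ``javax.jdo.option`` properties in ``hive-site.xml``. The fragment below is only an illustration; the host, database name, and credentials are placeholders that you should adjust to your own MySQL installation:

```xml
<!-- Example hive-site.xml fragment for a MySQL-backed metastore.
     Host, database name, and credentials are placeholders. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/metastore_db</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepassword</value>
</property>
```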
+Finally, you should specify HCatalogStore as Tajo catalog driver class in ``conf/catalog-site.xml`` as follows: ::
+
+ <property>
+ <name>tajo.catalog.store.class</name>
+ <value>org.apache.tajo.catalog.store.HCatalogStore</value>
+ </property>
http://git-wip-us.apache.org/repos/asf/incubator-tajo/blob/e3da0caf/tajo-docs/src/main/sphinx/jdbc_driver.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/jdbc_driver.rst b/tajo-docs/src/main/sphinx/jdbc_driver.rst
index 153fbff..306cdc1 100644
--- a/tajo-docs/src/main/sphinx/jdbc_driver.rst
+++ b/tajo-docs/src/main/sphinx/jdbc_driver.rst
@@ -2,8 +2,138 @@
Tajo JDBC Driver
*************************************
+Apache Tajo™ provides a JDBC driver,
+which enables Java applications to easily access Apache Tajo in an RDBMS-like manner.
+In this section, we explain how to get the JDBC driver and show an example client.
+
How to get JDBC driver
=======================
+Tajo provides the necessary jar files packaged by maven. In order to get the jar files,
+please follow the commands below.
+
+.. code-block:: bash
+
+ $ cd tajo-x.y.z-incubating
+ $ mvn clean package -DskipTests -Pdist -Dtar
+ $ ls -l tajo-dist/target/tajo-x.y.z-incubating/share/jdbc-dist
+
+
+Setting the CLASSPATH
+=======================
+
+In order to use the JDBC driver, you should add the jar files included in
+``tajo-dist/target/tajo-x.y.z-incubating/share/jdbc-dist`` to your ``CLASSPATH``.
+In addition, you should add the hadoop classpath to your ``CLASSPATH``.
+So, ``CLASSPATH`` will be set as follows:
+
+.. code-block:: bash
+
+ CLASSPATH=path/to/tajo-jdbc/*:${TAJO_HOME}/conf:$(hadoop classpath)
+
+.. note::
+
+ You can get the hadoop classpath part by executing the command ``bin/hadoop classpath`` in your hadoop cluster.
+
+.. note::
+
+ You may want to use a minimal set of JAR files. If so, please refer to :ref:`minimal_jar_files`.
+
+An Example JDBC Client
+=======================
+
+The JDBC driver class name is ``org.apache.tajo.jdbc.TajoDriver``.
+You can load the driver with ``Class.forName("org.apache.tajo.jdbc.TajoDriver").newInstance()``.
+The connection url should be ``jdbc:tajo://<TajoMaster hostname>:<TajoMaster client rpc port>``.
+The default TajoMaster client rpc port is ``26002``.
+If you want to change the listening port, please refer to :doc:`/configuration/configuration_defaults`.
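For clarity, the connection URL can also be assembled programmatically. The helper below is purely illustrative (``tajoUrl`` is not part of the Tajo API); it simply concatenates a host and the client rpc port into the ``jdbc:tajo://`` form described above:

```java
public class TajoUrlExample {
    // Hypothetical helper (not part of the Tajo API): builds a
    // connection URL in the jdbc:tajo://<host>:<port> form.
    static String tajoUrl(String host, int port) {
        return "jdbc:tajo://" + host + ":" + port;
    }

    public static void main(String[] args) {
        // 26002 is the default TajoMaster client rpc port.
        System.out.println(tajoUrl("127.0.0.1", 26002)); // jdbc:tajo://127.0.0.1:26002
    }
}
```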
+
+.. note::
+
+ Currently, Tajo does not support the concept of databases and namespaces.
+ All tables are contained in the ``default`` database. So, you don't need to specify a database name.
+
+The following shows an example JDBC client.
+
+.. code-block:: java
+
+ import java.sql.Connection;
+ import java.sql.ResultSet;
+ import java.sql.Statement;
+ import java.sql.DriverManager;
+
+ public class TajoJDBCClient {
+
+ ....
+
+ public static void main(String[] args) throws Exception {
+ Class.forName("org.apache.tajo.jdbc.TajoDriver").newInstance();
+ Connection conn = DriverManager.getConnection("jdbc:tajo://127.0.0.1:26002");
+
+ Statement stmt = null;
+ ResultSet rs = null;
+ try {
+ stmt = conn.createStatement();
+ rs = stmt.executeQuery("select * from table1");
+ while (rs.next()) {
+ System.out.println(rs.getString(1) + "," + rs.getString(3));
+ }
+ } finally {
+ if (rs != null) rs.close();
+ if (stmt != null) stmt.close();
+ if (conn != null) conn.close();
+ }
+ }
+ }
+
+
+Appendix
+===========================================
+
+.. _minimal_jar_files:
+
+Minimal JAR file list
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following is the minimal list of JAR files required.
+We have tested the JDBC driver with the following JAR files for
+usual SQL queries, but we do not guarantee that they are
+fully tested for all operations. So, you may need additional JAR files.
+In addition to the following JAR files, please don't forget to include
+``${HADOOP_HOME}/etc/hadoop`` and ``${TAJO_HOME}/conf`` in your ``CLASSPATH``.
+
+ * hadoop-annotations-2.2.0.jar
+ * hadoop-auth-2.2.0.jar
+ * hadoop-common-2.2.0.jar
+ * hadoop-hdfs-2.2.0.jar
+ * joda-time-2.3.jar
+ * tajo-catalog-common-0.8.0-SNAPSHOT.jar
+ * tajo-client-0.8.0-SNAPSHOT.jar
+ * tajo-common-0.8.0-SNAPSHOT.jar
+ * tajo-jdbc-0.8.0-SNAPSHOT.jar
+ * tajo-rpc-0.8.0-SNAPSHOT.jar
+ * tajo-storage-0.8.0-SNAPSHOT.jar
+ * log4j-1.2.17.jar
+ * commons-logging-1.1.1.jar
+ * guava-11.0.2.jar
+ * protobuf-java-2.5.0.jar
+ * netty-3.6.2.Final.jar
+ * commons-lang-2.5.jar
+ * commons-configuration-1.6.jar
+ * slf4j-api-1.7.5.jar
+ * slf4j-log4j12-1.7.5.jar
+ * commons-cli-1.2.jar
+ * commons-io-2.1.jar
+
+
+FAQ
+===========================================
+
+java.nio.channels.UnresolvedAddressException
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When retrieving the final result, the Tajo JDBC driver tries to access HDFS data nodes.
+So, the network between the JDBC client and the HDFS data nodes must be accessible.
+In many cases, an HDFS cluster is built on a private network which uses private hostnames.
+So, those hostnames must be resolvable on the JDBC client side.
-.. todo::
\ No newline at end of file