Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2009/08/07 04:04:01 UTC

[Hadoop Wiki] Update of "Hive/HiveODBC" by EricHwang

The following page has been changed by EricHwang:
http://wiki.apache.org/hadoop/Hive/HiveODBC

New page:
== Hive ODBC Driver ==
The Hive ODBC Driver is a software library that implements the Open Database Connectivity (ODBC) API standard for the Hive database management system, enabling ODBC-compliant applications to interact (ideally seamlessly) with Hive through a standard interface.

=== Suggested Reading ===
This guide assumes you are already familiar with the following:
 * [wiki:Hive Hive]
 * [wiki:Hive/HiveServer Hive Server]
 * [http://wiki.apache.org/thrift/ Thrift]
 * [http://msdn.microsoft.com/en-us/library/ms714177(VS.85).aspx ODBC API]
 * [http://www.unixodbc.org/ unixODBC]

=== Software Requirements ===
The following software components are needed for the successful compilation and operation of the Hive ODBC driver:
 * '''Hive Server''' - a service through which clients may remotely issue Hive commands and requests. The Hive ODBC driver depends on Hive Server to perform the core set of database interactions. Hive Server is built as part of the Hive build process. More information regarding Hive Server usage can be found [wiki:Hive/HiveServer here].
 * '''Apache Thrift''' - a scalable cross-language software framework that enables the Hive ODBC driver (specifically the Hive client) to communicate with the Hive Server. See [http://wiki.apache.org/thrift/ThriftInstallation Thrift Installation] for installation details. The Hive ODBC driver was developed with Thrift trunk version r790732, but the latest revision should also work. Note the Thrift install path during the Thrift build process, as it will be needed during the Hive client build process. The Thrift install path will be referred to as THRIFT_HOME.

=== Driver Architecture ===
Internally, the Hive ODBC Driver contains two separate components: Hive client, and the unixODBC API wrapper.
 * '''Hive client''' - provides a set of C-compatible library functions to interact with Hive Server in a pattern similar to that dictated by the ODBC specification. However, Hive client was designed to be independent of unixODBC and of any ODBC-specific headers, allowing it to be used in generic cases beyond ODBC.
 * '''unixODBC API wrapper''' - provides a layer on top of Hive client that directly implements the ODBC API standard. The unixODBC API wrapper will be compiled into a shared object library, which will be the final form of the Hive ODBC driver. This portion will remain a file attachment on the associated JIRA until it can be checked into the unixODBC code repository: [https://issues.apache.org/jira/browse/HIVE-187 HIVE-187].

NOTE: Hive client needs to be built and installed before the unixODBC API wrapper can compile successfully.

==== Hive Client Build/Setup ====
In order to build the Hive client:
 1. Check out and set up the latest version of Apache Hive. For more details, see [wiki:Hive/GettingStarted Getting Started with Hive]. From this point onwards, the path to the Hive root directory will be referred to as HIVE_HOME.
 1. Build the Hive client by running the following command from HIVE_HOME. This will compile and copy the libraries and header files to {{{HIVE_HOME/build/odbc/}}}. Please keep in mind that all paths should be fully specified (no relative paths).
 {{{
 $ ant compile-cpp -Dthrift.home=<THRIFT_HOME>
 }}}
 You can optionally force Hive client to compile for a non-native word size (32-bit or 64-bit) by specifying an additional parameter (assuming you have the proper compilation libraries):
 {{{
 $ ant compile-cpp -Dthrift.home=<THRIFT_HOME> -Dword.size=<32 or 64>
 }}}
 You can verify the compilation by running the Hive client test suite with the following command from {{{HIVE_HOME/odbc/}}}. NOTE: The Hive client tests require a local Hive Server running on port 10000.
 {{{
 $ ant test
 }}}
 1.#3 To install the Hive client libraries onto your machine, run the following command from {{{HIVE_HOME/odbc/}}}. NOTE: The install path defaults to {{{/usr/local}}}, but this can be changed by setting the {{{INSTALL_PATH}}} environment variable to a desired alternative.
 {{{
 $ ant install -Dthrift.home=<THRIFT_HOME>
 }}}
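
Because the build expects fully specified paths, it may help to check THRIFT_HOME before invoking ant. The following is a minimal sketch, not part of the Hive build itself: the helper name {{{is_absolute_path}}} and the default value {{{/opt/thrift}}} are assumptions made for illustration, and the script only prints the ant command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): succeeds only when the argument
# is a fully specified (absolute) path, i.e. it begins with "/".
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# THRIFT_HOME is expected to be set by the caller; /opt/thrift is only a
# placeholder default for this sketch.
THRIFT_HOME="${THRIFT_HOME:-/opt/thrift}"

if is_absolute_path "$THRIFT_HOME"; then
    # Print the build command from the steps above instead of running it.
    echo "ant compile-cpp -Dthrift.home=$THRIFT_HOME"
else
    echo "THRIFT_HOME must be an absolute path: $THRIFT_HOME" >&2
fi
```

The same check could be applied to any other path passed to the build, such as a custom INSTALL_PATH.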

==== unixODBC API Wrapper Build/Setup ====
After you have built and installed the Hive client, you can build and install the unixODBC API wrapper:
 1. In the unixODBC root directory, run the following command:
 {{{
 $ ./configure --enable-gui=no --prefix=<unixODBC_INSTALL_DIR>
 }}}
 If you encounter the errors "{{{redefinition of 'struct _hist_entry'}}}" or "{{{previous declaration of 'add_history' was here}}}", then rerun configure with the following command:
 {{{
 $ ./configure --enable-gui=no --enable-readline=no --prefix=<unixODBC_INSTALL_DIR>
 }}}
 1.#2 Compile the unixODBC API wrapper with the following:
 {{{
 $ make
 }}}
 To force the compilation of the unixODBC API wrapper into a non-native bit architecture, modify the CC and CXX environment variables to include the appropriate flags. For example:
 {{{
 $ CC="gcc -m32" CXX="g++ -m32" make
 }}}
 1.#3 If you want to completely install unixODBC and all related drivers:
  a. Run the following from the unixODBC root directory:
  {{{
  $ make install
  }}}
  a.#2 If your system complains about {{{undefined symbols}}} during unixODBC testing (such as with {{{isql}}} or {{{odbcinst}}}) after installation, try running {{{ldconfig}}} to update your library catalog.
 1.#4 If you only want to obtain the Hive ODBC driver shared object library:
  a. After compilation, the driver will be located at {{{<unixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0}}}.
  a. This may be copied to any other location as desired. Keep in mind that the Hive ODBC driver has a dependency on the Hive client shared object library: {{{libhiveclient.so}}}.
  a. You can manually install the unixODBC API wrapper by doing the following:
  {{{
  $ cp <unixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0 <SYSTEM_INSTALL_DIR>
  $ cd <SYSTEM_INSTALL_DIR>
  $ ln -s libodbchive.so.1.0.0 libodbchive.so
  $ ldconfig
  }}} 
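
Since the driver depends on {{{libhiveclient.so}}}, one way to confirm that the dependency resolves after a manual install is to inspect the output of {{{ldd}}}. The sketch below assumes a Linux system where {{{ldd}}} is available; the helper name {{{missing_libs}}} and the default driver location {{{/usr/local/lib/libodbchive.so}}} are hypothetical and should be adjusted to your <SYSTEM_INSTALL_DIR>.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints the name of
# every library that the dynamic linker reported as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Placeholder location; substitute the directory you copied the driver to.
DRIVER_LIB="${DRIVER_LIB:-/usr/local/lib/libodbchive.so}"

if [ -f "$DRIVER_LIB" ]; then
    # Any name printed here (for example libhiveclient.so) is unresolved.
    ldd "$DRIVER_LIB" | missing_libs
fi
```

If {{{libhiveclient.so}}} is listed, check that the Hive client libraries were installed and that {{{ldconfig}}} has been run.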

=== Connecting the Driver to a Driver Manager ===
This portion assumes that you have already built and installed both the Hive client and the unixODBC API wrapper shared libraries on the current machine. To connect the Hive ODBC driver to a previously installed Driver Manager (such as the one provided by unixODBC or a separate application):
 1. Locate the odbc.ini file associated with the Driver Manager (DM):
  a. If you are installing the driver on the system DM, then you can run the following command to print the locations of DM configuration files.
  {{{
  $ odbcinst -j
  unixODBC 2.2.14
  DRIVERS............: /usr/local/etc/odbcinst.ini
  SYSTEM DATA SOURCES: /usr/local/etc/odbc.ini
  FILE DATA SOURCES..: /usr/local/etc/ODBCDataSources
  USER DATA SOURCES..: /home/ehwang/.odbc.ini
  SQLULEN Size.......: 8
  SQLLEN Size........: 8
  SQLSETPOSIROW Size.: 8
  }}}
  a.#2 If you are installing the driver on an application DM, the location of its configuration files depends on the application. Hint: try looking in the installation directory of your application.
   i. Keep in mind that an application's DM can exist simultaneously with the system DM and will likely use its own configuration files, such as odbc.ini.
   i. Also, note that some applications do not have their own DMs and simply use the system DM.
 1. Add the following ini configuration entry to the DM's corresponding odbc.ini:
 {{{
 [Hive]
 Driver = <path_to_libodbchive.so>
 Description = Hive Driver v1
 DATABASE = default
 HOST = <Hive_server_address>
 PORT = <Hive_server_port>
 FRAMED = 0
 }}}
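
As an illustration of a filled-in entry, the sketch below prints the configuration above with concrete values. The helper name {{{hive_dsn_entry}}} is hypothetical, and the example values (driver at {{{/usr/local/lib/libodbchive.so}}}, Hive Server on {{{localhost}}} port 10000) are assumptions that must be replaced with the values for your installation.

```shell
#!/bin/sh
# Hypothetical helper: prints a [Hive] data source entry for odbc.ini.
# Arguments: 1 = path to libodbchive.so, 2 = Hive Server host, 3 = port.
hive_dsn_entry() {
    cat <<EOF
[Hive]
Driver = $1
Description = Hive Driver v1
DATABASE = default
HOST = $2
PORT = $3
FRAMED = 0
EOF
}

# Example values only; adjust them to match your own setup.
hive_dsn_entry /usr/local/lib/libodbchive.so localhost 10000
```

The output can be reviewed first and then appended to the odbc.ini file reported by {{{odbcinst -j}}}, for example by redirecting it with {{{>>}}}.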

=== Testing with ISQL ===
Once you have installed the necessary Hive ODBC libraries and added a Hive entry in your system's default odbc.ini, you will be able to interactively test the driver with isql:
{{{
$ isql -v Hive
}}}
If your system does not have isql, you can obtain it by installing the entirety of unixODBC.
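
For a non-interactive check, a statement can also be piped into isql in batch mode. This is a rough sketch under several assumptions: the data source is named {{{Hive}}} as in the entry above, a Hive Server is reachable at the configured host and port, and the installed isql accepts the {{{-b}}} (batch) option. The wrapper {{{run_hive_query}}} and the {{{ISQL}}} variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Name of the isql executable; overridable, defaults to the one on PATH.
ISQL="${ISQL:-isql}"

# Hypothetical wrapper: sends one statement to the Hive data source using
# isql in batch mode (-b), which reads statements from standard input.
run_hive_query() {
    printf '%s\n' "$1" | "$ISQL" Hive -b
}

if command -v "$ISQL" >/dev/null 2>&1; then
    run_hive_query "SHOW TABLES" || echo "query failed; check the Hive entry in odbc.ini" >&2
else
    echo "isql not found; install unixODBC first" >&2
fi
```

A failure at this step usually points to the data source configuration or to the Hive Server not running, rather than to the driver build.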

=== Current Status ===
 * Limitations:
  * No support for Unicode
  * Not thread safe
  * No support for asynchronous execution of queries
  * Does not check for memory allocation errors
 * ODBC API Function Support:
  * SQLAllocConnect - supported
  * SQLAllocEnv - supported
  * SQLAllocHandle - supported
  * SQLAllocStmt - supported
  * SQLBindCol - supported
  * SQLBindParameter - NOT supported
  * SQLCancel - NOT supported
  * SQLColAttribute - supported
  * SQLColumns - supported
  * SQLConnect - supported
  * SQLDescribeCol - supported
  * SQLDescribeParam - NOT supported
  * SQLDisconnect - supported
  * SQLDriverConnect - supported
  * SQLError - supported
  * SQLExecDirect - supported
  * SQLExecute - supported
  * SQLExtendedFetch - NOT supported
  * SQLFetch - supported
  * SQLFetchScroll - NOT supported
  * SQLFreeConnect - supported
  * SQLFreeEnv - supported
  * SQLFreeHandle - supported
  * SQLFreeStmt - supported
  * SQLGetConnectAttr - NOT supported
  * SQLGetData - supported (however, SQLSTATE not returning values)
  * SQLGetDiagField - NOT supported
  * SQLGetDiagRec - supported
  * SQLGetInfo - NOT supported; a limited version may be provided
  * SQLMoreResults - NOT supported
  * SQLNumParams - NOT supported
  * SQLNumResultCols - supported
  * SQLParamOptions - NOT supported
  * SQLPrepare - supported; but does not permit parameter markers
  * SQLRowCount - NOT supported
  * SQLSetConnectAttr - NOT supported
  * SQLSetConnectOption - NOT supported
  * SQLSetEnvAttr - Limited support
  * SQLSetStmtAttr - NOT supported
  * SQLSetStmtOption - NOT supported
  * SQLTables - supported
  * SQLTransact - NOT supported