Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/03/04 15:26:00 UTC

[jira] [Work logged] (HIVE-24396) [New Feature] Add data connector support for remote datasources

     [ https://issues.apache.org/jira/browse/HIVE-24396?focusedWorklogId=560999&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560999 ]

ASF GitHub Bot logged work on HIVE-24396:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Mar/21 15:25
            Start Date: 04/Mar/21 15:25
    Worklog Time Spent: 10m 
      Work Description: nrg4878 opened a new pull request #2037:
URL: https://github.com/apache/hive/pull/2037


   Adding support for data connectors in HiveQL. Added DDL support for create, drop, alter, describe, show operations.
   CREATE CONNECTOR <connector_name> TYPE 'mysql' URL 'jdbcURL' COMMENT 'comment' WITH DCPROPERTIES(...);
   
   Adding a type to Hive databases:
   CREATE DATABASE --> creates a NATIVE database
   CREATE REMOTE DATABASE --> creates a REMOTE database that maps to a database in another datasource.
   Example: a database in an RDBMS.
   
   Added DDL support for REMOTE databases (create, drop, describe, alter)
   Beeline displays slightly different content for NATIVE databases vs REMOTE databases.
   
   Provided partial implementations for MySQL and Postgres: only getTable and getTableNames are implemented; getTables is yet to be implemented.
   
   Pending changes:
   These changes are for initial review and are incomplete.
   
   Schema changes currently support only DERBY. Schema changes for other DBs are yet to come.
   Unit tests
   qtests
   getTables() implementation
   Support for DERBY/MSSQL/ORACLE data connectors.
   Currently HS2 converts all table names to lower case. If a remote datasource treats table names as case-sensitive, queries will fail because HMS will not be able to find the remote table metadata. This is yet to be addressed.
   User-facing documentation.
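   
   To illustrate the table-name case issue listed above, here is a minimal Python sketch. It is not Hive code: the dict standing in for a case-sensitive remote datasource and the names REMOTE_TABLES, normalize_like_hs2, lookup_remote_table and resolve are all hypothetical, chosen only for this illustration.

```python
# Hypothetical illustration only: a dict stands in for a case-sensitive
# remote datasource. None of these names come from the Hive code base.
REMOTE_TABLES = {"Orders": ["id", "total"], "customers": ["id", "name"]}


def normalize_like_hs2(table_name: str) -> str:
    """Lower-case the name, as the description says HS2 currently does."""
    return table_name.lower()


def lookup_remote_table(table_name: str):
    """Case-sensitive lookup; returns the column list, or None if absent."""
    return REMOTE_TABLES.get(table_name)


def resolve(table_name: str):
    """Resolve a table the way a query would: normalize, then look up."""
    return lookup_remote_table(normalize_like_hs2(table_name))


if __name__ == "__main__":
    # Found: the remote name is already lower case.
    print(resolve("customers"))
    # Not found: "orders" does not match the remote name "Orders".
    print(resolve("Orders"))
```

   In this sketch the second lookup fails only because the name was lower-cased before it reached the case-sensitive lookup; the table itself exists under its original spelling.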
   
   Why are the changes needed?
   These changes enable Hive to run queries against remote datasources whose metadata does not reside in HMS. They give Hive the ability to map many tables at once without having to define each of them individually.
   
   Does this PR introduce any user-facing change?
   Yes, this introduces user-facing changes, but they should be backward compatible.
   New grammar changes:
   create connector ... --> creates connector metadata in Hive. This holds all the connection-related properties, such as credentials and URLs.
   drop connector --> drops the connector metadata
   describe connector [extended] --> describes an existing connector
   show connectors --> displays all connectors
   alter connector --> alters existing metadata for a connector
   
   Change to existing grammar:
   create REMOTE database USING <connector_name> WITH DBPROPERTIES.
   Creates a remote database using a data connector.
   
   How was this patch tested?
   Tested manually against MySQL and Postgres databases.
   Unit tests and qtests are pending.
   




Issue Time Tracking
-------------------

            Worklog Id:     (was: 560999)
    Remaining Estimate: 0h
            Time Spent: 10m

> [New Feature] Add data connector support for remote datasources
> ---------------------------------------------------------------
>
>                 Key: HIVE-24396
>                 URL: https://issues.apache.org/jira/browse/HIVE-24396
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This feature adds support in Hive Metastore for configuring data connectors for remote datasources and mapping databases to them. We currently support remote tables via StorageHandlers such as JDBCStorageHandler and HBaseStorageHandler.
> Data connectors are a natural extension of this, letting us map an entire database or catalog instead of individual tables. The tables within are automagically mapped at runtime. The metadata for these tables is not persisted in Hive; it is always mapped and built at runtime.
> With this feature, we introduce the concept of a type for databases in Hive: NATIVE vs REMOTE. All current databases are NATIVE. To create a REMOTE database, the following syntax is to be used:
> CREATE REMOTE DATABASE remote_db USING <dataconnector> WITH DCPROPERTIES (....);
> Will attach a design doc to this jira. 


