Posted to commits@sqoop.apache.org by ja...@apache.org on 2014/12/13 16:56:46 UTC
sqoop git commit: SQOOP-1757: Sqoop2: Document generic jdbc connector
Repository: sqoop
Updated Branches:
refs/heads/sqoop2 293e9ef63 -> ee097891b
SQOOP-1757: Sqoop2: Document generic jdbc connector
(Abraham Elmahrek via Jarek Jarcec Cecho)
Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/ee097891
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/ee097891
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/ee097891
Branch: refs/heads/sqoop2
Commit: ee097891b727acc6dd0b1f6222d7758e436f71c9
Parents: 293e9ef
Author: Jarek Jarcec Cecho <ja...@apache.org>
Authored: Sat Dec 13 07:55:31 2014 -0800
Committer: Jarek Jarcec Cecho <ja...@apache.org>
Committed: Sat Dec 13 07:55:31 2014 -0800
----------------------------------------------------------------------
docs/src/site/sphinx/Connectors.rst | 199 +++++++++++++++++++++++++++++++
docs/src/site/sphinx/index.rst | 1 +
2 files changed, 200 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/sqoop/blob/ee097891/docs/src/site/sphinx/Connectors.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/Connectors.rst b/docs/src/site/sphinx/Connectors.rst
new file mode 100644
index 0000000..bcc5b43
--- /dev/null
+++ b/docs/src/site/sphinx/Connectors.rst
@@ -0,0 +1,199 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+
+==================
+Sqoop 2 Connectors
+==================
+
+This document describes how to use the built-in connectors. This includes a detailed description of how connectors partition, format their output, extract data, and load data.
+
+.. contents::
+ :depth: 3
+
+++++++++++++++++++++++
+Generic JDBC Connector
+++++++++++++++++++++++
+
+The Generic JDBC Connector can connect to any data source that adheres to the **JDBC 4** specification.
+
+-----
+Usage
+-----
+
+To use the Generic JDBC Connector, create a link for the connector and a job that uses the link.
+
+**Link Configuration**
+++++++++++++++++++++++
+
+Inputs associated with the link configuration include:
+
++-----------------------------+---------+-----------------------------------------------------------------------+------------------------------------------+
+| Input | Type | Description | Example |
++=============================+=========+=======================================================================+==========================================+
+| JDBC Driver Class | String | The full class name of the JDBC driver. | com.mysql.jdbc.Driver |
+| | | *Required* and accessible by the Sqoop server. | |
++-----------------------------+---------+-----------------------------------------------------------------------+------------------------------------------+
+| JDBC Connection String | String | The JDBC connection string to use when connecting to the data source. | jdbc:mysql://localhost/test |
+| | | *Required*. Connectivity upon creation is optional. | |
++-----------------------------+---------+-----------------------------------------------------------------------+------------------------------------------+
+| Username | String | The username to provide when connecting to the data source. | sqoop |
+| | | *Optional*. Connectivity upon creation is optional. | |
++-----------------------------+---------+-----------------------------------------------------------------------+------------------------------------------+
+| Password | String | The password to provide when connecting to the data source. | sqoop |
+| | | *Optional*. Connectivity upon creation is optional. | |
++-----------------------------+---------+-----------------------------------------------------------------------+------------------------------------------+
+| JDBC Connection Properties  | Map     | A map of JDBC connection properties to pass to the JDBC driver.       | profileSQL=true&useFastDateParsing=false |
+|                             |         | *Optional*.                                                            |                                          |
+-----------------------------+---------+-----------------------------------------------------------------------+------------------------------------------+
+
+**FROM Job Configuration**
+++++++++++++++++++++++++++
+
+Inputs associated with the Job configuration for the FROM direction include:
+
++-----------------------------+---------+-------------------------------------------------------------------------+---------------------------------------------+
+| Input | Type | Description | Example |
++=============================+=========+=========================================================================+=============================================+
+| Schema name | String | The schema name the table is part of. | sqoop |
+| | | *Optional* | |
++-----------------------------+---------+-------------------------------------------------------------------------+---------------------------------------------+
+| Table name | String | The table name to import data from. | test |
+| | | *Optional*. See note below. | |
++-----------------------------+---------+-------------------------------------------------------------------------+---------------------------------------------+
+| Table SQL statement | String | The SQL statement used to perform a **free form query**. | ``SELECT COUNT(*) FROM test ${CONDITIONS}`` |
+| | | *Optional*. See note below. | |
++-----------------------------+---------+-------------------------------------------------------------------------+---------------------------------------------+
+| Table column names | String | Columns to extract from the JDBC data source. | col1,col2 |
+|                             |         | *Optional*. Comma-separated list of columns.                            |                                             |
++-----------------------------+---------+-------------------------------------------------------------------------+---------------------------------------------+
+| Partition column name       | String  | The column name used to partition the data transfer process.           | col1                                        |
+| | | *Optional*. Defaults to primary key of table. | |
++-----------------------------+---------+-------------------------------------------------------------------------+---------------------------------------------+
+| Null value allowed for      | Boolean | True or false depending on whether NULL values are allowed in the      | true                                        |
+| the partition column        |         | partition column. *Optional*.                                           |                                             |
++-----------------------------+---------+-------------------------------------------------------------------------+---------------------------------------------+
+| Boundary query | String | The query used to define an upper and lower boundary when partitioning. | |
+| | | *Optional*. | |
++-----------------------------+-----------------------------------------------------------------------------------+---------------------------------------------+
+
+**Notes**
+=========
+
+1. *Table name* and *Table SQL statement* are mutually exclusive: provide one or the other, but never both.
+2. *Table column names* should be provided only if *Table name* is provided.
+
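+When *Table SQL statement* is used, the query should include the ``${CONDITIONS}``
+token, which is replaced with the conditions generated by the partitioner, for
+example:
+::
+
+  SELECT col1, col2 FROM test ${CONDITIONS}
+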
+**TO Job Configuration**
+++++++++++++++++++++++++
+
+Inputs associated with the Job configuration for the TO direction include:
+
++-----------------------------+---------+-------------------------------------------------------------------------+-------------------------------------------------+
+| Input | Type | Description | Example |
++=============================+=========+=========================================================================+=================================================+
+| Schema name | String | The schema name the table is part of. | sqoop |
+| | | *Optional* | |
++-----------------------------+---------+-------------------------------------------------------------------------+-------------------------------------------------+
+| Table name                  | String  | The table name to export data to.                                       | test                                            |
+| | | *Optional*. See note below. | |
++-----------------------------+---------+-------------------------------------------------------------------------+-------------------------------------------------+
+| Table SQL statement | String | The SQL statement used to perform a **free form query**. | ``INSERT INTO test (col1, col2) VALUES (?, ?)`` |
+| | | *Optional*. See note below. | |
++-----------------------------+---------+-------------------------------------------------------------------------+-------------------------------------------------+
+| Table column names | String | Columns to insert into the JDBC data source. | col1,col2 |
+|                             |         | *Optional*. Comma-separated list of columns.                             |                                                 |
++-----------------------------+---------+-------------------------------------------------------------------------+-------------------------------------------------+
+| Stage table name | String | The name of the table used as a *staging table*. | staging |
+| | | *Optional*. | |
++-----------------------------+---------+-------------------------------------------------------------------------+-------------------------------------------------+
+| Should clear stage table | Boolean | True or false depending on whether the staging table should be cleared | true |
+| | | after the data transfer has finished. *Optional*. | |
++-----------------------------+-----------------------------------------------------------------------------------+-------------------------------------------------+
+
+**Notes**
+=========
+
+1. *Table name* and *Table SQL statement* are mutually exclusive: provide one or the other, but never both.
+2. *Table column names* should be provided only if *Table name* is provided.
+
+-----------
+Partitioner
+-----------
+
+The Generic JDBC Connector partitioner generates conditions to be used by the extractor.
+The partitioning strategy depends on the data type of the partition column, but each
+strategy computes a partition size of roughly the following form:
+::
+
+ (upper boundary - lower boundary) / (max partitions)
+
+By default, the *primary key* will be used to partition the data unless otherwise specified.
+
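+For example, partitioning on an integer column with lower boundary 0, upper
+boundary 1000 and a maximum of 10 partitions gives a partition size of
+(1000 - 0) / 10 = 100, producing conditions roughly of the following form
+(illustrative, not the exact generated SQL):
+::
+
+  col1 >= 0 AND col1 < 100
+  col1 >= 100 AND col1 < 200
+  ...
+  col1 >= 900 AND col1 <= 1000
+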
+The following data types are currently supported:
+
+1. TINYINT
+2. SMALLINT
+3. INTEGER
+4. BIGINT
+5. REAL
+6. FLOAT
+7. DOUBLE
+8. NUMERIC
+9. DECIMAL
+10. BIT
+11. BOOLEAN
+12. DATE
+13. TIME
+14. TIMESTAMP
+15. CHAR
+16. VARCHAR
+17. LONGVARCHAR
+
+---------
+Extractor
+---------
+
+During the *extraction* phase, the JDBC data source is queried using SQL. This SQL will vary based on your configuration.
+
+- If *Table name* is provided, then the SQL statement generated will take on the form ``SELECT * FROM <table name>``.
+- If *Table name* and *Columns* are provided, then the SQL statement generated will take on the form ``SELECT <columns> FROM <table name>``.
+- If *Table SQL statement* is provided, then the provided SQL statement will be used.
+
+The conditions generated by the *partitioner* are appended to the end of the SQL query to query a section of data.
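+
+For example, with *Table name* ``test`` and partition column ``id``, the query
+executed for one partition might look like the following (illustrative):
+::
+
+  SELECT * FROM test WHERE id >= 0 AND id < 100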
+
+The Generic JDBC connector extracts CSV data usable by the *CSV Intermediate Data Format*.
+
+------
+Loader
+------
+
+During the *loading* phase, data is written to the JDBC data source using SQL. The SQL statement used will vary based on your configuration.
+
+- If *Table name* is provided, then the SQL statement generated will take on the form ``INSERT INTO <table name> (col1, col2, ...) VALUES (?,?,..)``.
+- If *Table name* and *Columns* are provided, then the SQL statement generated will take on the form ``INSERT INTO <table name> (<columns>) VALUES (?,?,..)``.
+- If *Table SQL statement* is provided, then the provided SQL statement will be used.
+
+This connector expects to receive CSV data consumable by the *CSV Intermediate Data Format*.
+
+----------
+Destroyers
+----------
+
+The Generic JDBC Connector performs two operations in the destroyer in the TO direction:
+
+1. Copy the contents of the staging table to the desired table.
+2. Clear the staging table.
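+
+These two operations can be pictured as SQL of roughly the following shape
+(illustrative, not the exact statements issued by the connector):
+::
+
+  INSERT INTO test SELECT * FROM staging;
+  DELETE FROM staging;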
+
+No operations are performed in the FROM direction.
http://git-wip-us.apache.org/repos/asf/sqoop/blob/ee097891/docs/src/site/sphinx/index.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/index.rst b/docs/src/site/sphinx/index.rst
index 8257858..9c95d08 100644
--- a/docs/src/site/sphinx/index.rst
+++ b/docs/src/site/sphinx/index.rst
@@ -47,6 +47,7 @@ If you are excited to start using Sqoop you can follow the links below to get a
- `Sqoop 5 Minute Demo <Sqoop5MinutesDemo.html>`_
- `Command Line Shell Usage Guide <CommandLineClient.html>`_
+- `Connectors <Connectors.html>`_
Developer Guide
-----------------