Posted to commits@sqoop.apache.org by ab...@apache.org on 2015/01/21 04:04:53 UTC

sqoop git commit: SQOOP-1908: Sqoop2: Document external connector support

Repository: sqoop
Updated Branches:
  refs/heads/sqoop2 1f89de217 -> e41bc6e31


SQOOP-1908: Sqoop2: Document external connector support

(Veena Basavaraj via Abraham Elmahrek)


Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/e41bc6e3
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/e41bc6e3
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/e41bc6e3

Branch: refs/heads/sqoop2
Commit: e41bc6e31f5deb00524a785945eed19db4b4d30e
Parents: 1f89de2
Author: Abraham Elmahrek <ab...@apache.org>
Authored: Tue Jan 20 19:03:50 2015 -0800
Committer: Abraham Elmahrek <ab...@apache.org>
Committed: Tue Jan 20 19:03:50 2015 -0800

----------------------------------------------------------------------
 docs/src/site/sphinx/ConnectorDevelopment.rst | 102 +++++++++++++++++++++
 docs/src/site/sphinx/index.rst                |   2 +-
 2 files changed, 103 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/sqoop/blob/e41bc6e3/docs/src/site/sphinx/ConnectorDevelopment.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/ConnectorDevelopment.rst b/docs/src/site/sphinx/ConnectorDevelopment.rst
index 280b502..16b0ecf 100644
--- a/docs/src/site/sphinx/ConnectorDevelopment.rst
+++ b/docs/src/site/sphinx/ConnectorDevelopment.rst
@@ -62,6 +62,19 @@ Connectors can optionally override the following methods:
   public List<Direction> getSupportedDirections();
   public Class<? extends IntermediateDataFormat<?>> getIntermediateDataFormat()
 
+The ``getVersion`` method returns the current version of the connector.
+It is important to provide a unique version identifier every time a connector jar is released externally.
+For the Sqoop built-in connectors, the version refers to the Sqoop build/release version. External
+connectors can use the same or a similar mechanism to set this version. The version number is critical for
+the connector upgrade logic used in Sqoop.
+
+::
+
+   @Override
+   public String getVersion() {
+     return VersionInfo.getBuildVersion();
+   }
+
 
 The ``getFrom`` method returns From_ instance
 which is a ``Transferable`` entity that encapsulates the operations
@@ -237,6 +250,72 @@ Loader must iterate in the ``load`` method until the data from ``DataReader`` is
 
 NOTE: we do not yet support a stage for connector developers to control how to balance the loading/writing of data across the multiple loaders. In the future we may add this to the connector API so that custom logic can balance the loading across multiple reducers.
 
+Sqoop Connector Identifier : sqoopconnector.properties
+======================================================
+
+Every Sqoop 2 connector needs to have a ``sqoopconnector.properties`` file in the packaged jar to be identified by Sqoop.
+A typical ``sqoopconnector.properties`` for a Sqoop 2 connector looks like this:
+
+::
+
+ # Sqoop Foo Connector Properties
+ org.apache.sqoop.connector.class = org.apache.sqoop.connector.foo.FooConnector
+ org.apache.sqoop.connector.name = sqoop-foo-connector
+
+If the above file does not exist, Sqoop will not load the jar, and the connector cannot be registered in the Sqoop repository for creating Sqoop jobs.
+
+
+Sqoop Connector Build-time Dependencies
+=======================================
+
+Sqoop provides the connector-sdk module, identified by the package ``org.apache.sqoop.connector``. It provides the public-facing APIs for external connectors
+to extend. It also provides common utilities that connectors can use for converting data to and from the Sqoop intermediate data format.
+
+The common-test module, identified by the package ``org.apache.sqoop.common.test``, provides utilities related to the built-in connectors, such as the JDBC, HDFS,
+and Kafka connectors, that external connectors can use to create end-to-end integration tests for Sqoop jobs.
+
+The test module, identified by the package ``org.apache.sqoop.test``, provides various minicluster utilities that integration tests can extend to run
+a Sqoop job with the given connector, using it as either a ``FROM`` or a ``TO`` data source.
+
+Hence the ``pom.xml`` for the Sqoop Kite connector, built using the kite-sdk, might look something like this:
+
+::
+
+  <dependencies>
+    <!-- Sqoop modules -->
+    <dependency>
+      <groupId>org.apache.sqoop</groupId>
+      <artifactId>connector-sdk</artifactId>
+    </dependency>
+
+    <!-- Testing specified modules -->
+    <dependency>
+      <groupId>org.testng</groupId>
+      <artifactId>testng</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-all</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.sqoop</groupId>
+      <artifactId>sqoop-common-test</artifactId>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.sqoop</groupId>
+      <artifactId>test</artifactId>
+    </dependency>
+    <!-- Connector required modules -->
+    <dependency>
+      <groupId>org.kitesdk</groupId>
+      <artifactId>kite-data-core</artifactId>
+    </dependency>
+    ....
+  </dependencies>
+
 Configurables
 +++++++++++++
 
@@ -370,6 +449,27 @@ Sqoop 2 provides a list of standard input validators that can be used by differe
 The validation logic is executed when users creating the sqoop jobs input values for the link and job configs associated with the ``From`` and ``To`` instances of the connectors associated with the job.
 
 
+Loading External Connectors
++++++++++++++++++++++++++++
+
+To load a new connector, say sqoop-foo-connector, into Sqoop 2, follow these steps:
+
+1. Create a ``sqoop-foo-connector.jar``. Make sure the jar contains the ``sqoopconnector.properties`` file so that Sqoop can pick it up.
+
+2. Add this jar to a folder on your installation machine, then set the key ``org.apache.sqoop.connector.external.loadpath`` in the ``sqoop.properties`` file, located under the ``server/conf`` directory of the Sqoop 2 installation, to the path of this folder.
+
+::
+
+ #
+ # External connectors load path
+ # "/path/to/external/connectors/": Add all the connector JARs in the specified folder
+ #
+ org.apache.sqoop.connector.external.loadpath=/path/to/connector
+
+3. Start the Sqoop 2 server. While the server initializes, this jar should be loaded onto the Sqoop 2 class path and registered in the Sqoop 2 repository.
+
+
+
 Sqoop 2 MapReduce Job Execution Lifecycle with Connector API
 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 
@@ -457,6 +557,8 @@ The diagram below decribes the reduce phase of a job.
         |                 |                  |            |                |            |       | write into Data Source
         |                 |                  |            |                |            |       |----------------------->
 
+More details can be found in `Sqoop MR Execution Engine`_
 
+.. _`Sqoop MR Execution Engine`: https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+MR+Execution+Engine
 
 .. _`Intermediate Data Format representation`: https://cwiki.apache.org/confluence/display/SQOOP/Sqoop2+Intermediate+representation
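
Note on the ``sqoopconnector.properties`` section in the patch above: a connector author may want to sanity-check the file before packaging the jar. The following is a minimal sketch, not Sqoop code. The class ``ConnectorPropertiesCheck`` and its methods are hypothetical names, and it assumes that both keys shown in the sample file are required; the patch itself only says the file must exist.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Sqoop: checks the two keys that appear in
// the sample sqoopconnector.properties from the documentation patch.
class ConnectorPropertiesCheck {
    static final String CLASS_KEY = "org.apache.sqoop.connector.class";
    static final String NAME_KEY = "org.apache.sqoop.connector.name";

    // Parses text in the standard java.util.Properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumption: both keys must be present and non-blank.
    static boolean hasRequiredKeys(Properties props) {
        return isSet(props.getProperty(CLASS_KEY)) && isSet(props.getProperty(NAME_KEY));
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        String sample = "# Sqoop Foo Connector Properties\n"
            + "org.apache.sqoop.connector.class = org.apache.sqoop.connector.foo.FooConnector\n"
            + "org.apache.sqoop.connector.name = sqoop-foo-connector\n";
        Properties props = parse(sample);
        System.out.println("required keys present: " + hasRequiredKeys(props));
        System.out.println("connector name: " + props.getProperty(NAME_KEY));
    }
}
```

A check like this could run as a unit test in the connector's own build, so that a missing or misspelled key is caught before the jar reaches a Sqoop 2 server.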

http://git-wip-us.apache.org/repos/asf/sqoop/blob/e41bc6e3/docs/src/site/sphinx/index.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/index.rst b/docs/src/site/sphinx/index.rst
index 43ef215..666f3c3 100644
--- a/docs/src/site/sphinx/index.rst
+++ b/docs/src/site/sphinx/index.rst
@@ -57,7 +57,7 @@ If you are keen on contributing to Sqoop and get your hands dirty building conne
 
 - `Building Sqoop 2 <BuildingSqoop2.html>`_
 - `Sqoop Development Environment Setup <DevEnv.html>`_
-- `Developing a Sqoop Connector with Connection API <ConnectorDevelopment.html>`_
+- `Developing a Sqoop Connector with Connector API <ConnectorDevelopment.html>`_
 - `Developing Sqoop application with REST API <RESTAPI.html>`_
 - `Developing Sqoop application using Sqoop Java Client API <ClientAPI.html>`_
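
Note on the "Loading External Connectors" steps in ConnectorDevelopment.rst above: before starting the server, it can help to confirm which jars in the load path folder actually carry a ``sqoopconnector.properties`` entry. The sketch below is not how Sqoop itself scans the folder. ``ExternalConnectorScan`` is a hypothetical name, and it assumes the properties file sits at the root of the jar.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical pre-flight check, not part of Sqoop.
class ExternalConnectorScan {
    // Assumption: the marker file is stored at the root of the connector jar.
    static final String MARKER = "sqoopconnector.properties";

    // Returns true if the jar contains the marker entry.
    static boolean hasMarker(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(MARKER) != null;
        }
    }

    // Lists the *.jar files directly under loadPath that contain the marker.
    static List<Path> findConnectorJars(Path loadPath) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(loadPath, "*.jar")) {
            for (Path jar : jars) {
                if (hasMarker(jar)) {
                    found.add(jar);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: ExternalConnectorScan /path/to/connector");
            return;
        }
        for (Path jar : findConnectorJars(Paths.get(args[0]))) {
            System.out.println(jar);
        }
    }
}
```

Run it with the same folder that ``org.apache.sqoop.connector.external.loadpath`` points to. Jars that are not printed would, per the documentation above, not be picked up by Sqoop.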