Posted to commits@sqoop.apache.org by ve...@apache.org on 2015/08/08 05:42:41 UTC
sqoop git commit: SQOOP-2333: Sqoop to support Custom options for User Defined Plugins(Tool)
Repository: sqoop
Updated Branches:
refs/heads/trunk 1b32a53e6 -> 73ef0c133
SQOOP-2333: Sqoop to support Custom options for User Defined Plugins(Tool)
(Rakesh Sharma via Venkat Ranganathan)
Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/73ef0c13
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/73ef0c13
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/73ef0c13
Branch: refs/heads/trunk
Commit: 73ef0c133bb1be875c92fe42e9e13c58d77d8360
Parents: 1b32a53
Author: Venkat Ranganathan <ve...@hortonworks.com>
Authored: Fri Aug 7 20:42:11 2015 -0700
Committer: Venkat Ranganathan <ve...@hortonworks.com>
Committed: Fri Aug 7 20:42:11 2015 -0700
----------------------------------------------------------------------
src/docs/dev/SqoopDevGuide.txt | 2 +-
src/docs/dev/plugin-arch.txt | 184 +++++++++++++++++++
src/java/org/apache/sqoop/SqoopOptions.java | 20 ++
.../org/apache/sqoop/util/SqoopJsonUtil.java | 82 +++++++++
.../apache/sqoop/util/TestSqoopJsonUtil.java | 87 +++++++++
5 files changed, 374 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/docs/dev/SqoopDevGuide.txt
----------------------------------------------------------------------
diff --git a/src/docs/dev/SqoopDevGuide.txt b/src/docs/dev/SqoopDevGuide.txt
index a81cfda..8f9ba24 100644
--- a/src/docs/dev/SqoopDevGuide.txt
+++ b/src/docs/dev/SqoopDevGuide.txt
@@ -48,5 +48,5 @@ include::compiling.txt[]
include::api-reference.txt[]
-
+include::plugin-arch.txt[]
http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/docs/dev/plugin-arch.txt
----------------------------------------------------------------------
diff --git a/src/docs/dev/plugin-arch.txt b/src/docs/dev/plugin-arch.txt
new file mode 100644
index 0000000..11e3c6e
--- /dev/null
+++ b/src/docs/dev/plugin-arch.txt
@@ -0,0 +1,184 @@
+
+////
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+////
+
+Developing Sqoop Plugins
+------------------------
+Sqoop allows users to develop their own plugins. Users can develop their
+plugins as separate jars, deploy them in $SQOOP_LIB and register them with
+Sqoop. In fact, Sqoop's architecture is plugin based, and all the internal
+tools such as import, export and merge are themselves implemented as tool
+plugins. Users can also develop their own custom tool plugins. Once
+deployed and registered with Sqoop, these plugins work like any other
+internal tool. They are also listed among the tools when you run the
++sqoop help+ command.
+
+BaseSqoopTool - Base class for User defined Tools
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+BaseSqoopTool is the base class for all Sqoop Tools. If you want to develop
+a custom tool, you need to inherit your tool from BaseSqoopTool and override
+the following methods:
+
+- +public int run(SqoopOptions options)+ : This is the main method for the
+ tool and acts as entry point for execution for your custom tool.
+- +public void configureOptions(ToolOptions toolOptions)+ : Configures the
+ command-line arguments we expect to receive. You can also specify the
+ description of all the command line arguments. When a user executes
+ +sqoop help <your tool>+, the information which is provided in this
+ method will be output to the user.
+- +public void applyOptions(CommandLine in, SqoopOptions out)+ : parses all
+ options and populates SqoopOptions which acts as a data transfer object
+ during the complete execution.
+- +public void validateOptions(SqoopOptions options)+ : provide any
+ validations required for your options.
+
+Supporting User defined custom options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Sqoop parses the arguments passed by the user and stores them in a
+SqoopOptions object, which then acts as a data transfer object. This
+object is passed to the various phases of processing: preprocessing before
+the actual MapReduce job, the MapReduce phase and even postprocessing.
+This class has a lot of members, and each parsed option is populated in its
+respective member. Now let's say that a user creates a new user defined tool
+and this tool has some new options which don't map to any of the existing
+members of the SqoopOptions class. One option is to add a new member to the
+SqoopOptions class, which means making changes in Sqoop itself and
+recompiling it, which might not be possible for all users. Another option
+is to use the +extraArgs+ member. This is a string array which contains the
+options for third-party tools, which can be passed directly to a third-party
+tool such as mysqldump. This string array needs parsing every time
+to understand the parameters.
+The most elegant way of supporting custom options for a user defined tool is
+the +customToolOptions+ map. This is a map member of the SqoopOptions class.
+The developer can parse the user defined parameters and populate this map with
+appropriate key/value pairs. When the SqoopOptions object is passed to the
+various phases of processing, these values are readily available and parsing
+is not required for every access.
+Let's take an example to understand the usage better. Let's say you want to
+develop a custom tool to merge two Hive tables and it will take the following
+parameters:
+
+- +--hive-updates-database+
+- +--hive-updates-table+
+- +--merge-keys+
+- +--retain-updates-tbl+
+
+None of these options are available in the SqoopOptions object. The tool
+developer can override the +applyOptions+ method and, in this method, parse
+the user options and populate the customToolOptions map. Once that is done,
+the SqoopOptions object can be passed throughout the program and these values
+will be available to users.
+
+These option names will be stored as keys and the values passed by users
+will be stored as values. Let's define these options as static finals:
+....
+ public static final String MERGE_KEYS = "merge-keys";
+ public static final String HIVE_UPDATES_TABLE = "hive-updates-table";
+ public static final String HIVE_UPDATES_TABLE_DB = "hive-updates-database";
+ public static final String RETAIN_UPDATES_TBL = "retain-updates-tbl";
+....
+
+A sample applyOptions implementation which parses the options above and
+populates the customToolOptions map is below:
+....
+ public void applyOptions(CommandLine in, SqoopOptions out)
+ throws InvalidOptionsException {
+
+ if (in.hasOption(VERBOSE_ARG)) {
+ LoggingUtils.setDebugLevel();
+ log.debug("Enabled debug logging.");
+ }
+
+ if (in.hasOption(HELP_ARG)) {
+ ToolOptions toolOpts = new ToolOptions();
+ configureOptions(toolOpts);
+ printHelp(toolOpts);
+ throw new InvalidOptionsException("");
+ }
+
+ Map<String, String> mergeOptionsMap = new HashMap<String, String>();
+ if (in.hasOption(MERGE_KEYS)) {
+ mergeOptionsMap.put(MERGE_KEYS, in.getOptionValue(MERGE_KEYS));
+ }
+
+ if (in.hasOption(HIVE_UPDATES_TABLE)) {
+ mergeOptionsMap.put(HIVE_UPDATES_TABLE,
+ in.getOptionValue(HIVE_UPDATES_TABLE));
+ }
+
+ if (in.hasOption(HIVE_UPDATES_TABLE_DB)) {
+ mergeOptionsMap.put(HIVE_UPDATES_TABLE_DB,
+ in.getOptionValue(HIVE_UPDATES_TABLE_DB));
+ }
+
+ if (in.hasOption(RETAIN_UPDATES_TBL)) {
+ mergeOptionsMap.put(RETAIN_UPDATES_TBL, "");
+ }
+
+ if (in.hasOption(HIVE_TABLE_ARG)) {
+ out.setHiveTableName(in.getOptionValue(HIVE_TABLE_ARG));
+ }
+
+ if (in.hasOption(HIVE_DATABASE_ARG)) {
+ out.setHiveDatabaseName(in.getOptionValue(HIVE_DATABASE_ARG));
+ }
+
+ if (out.getCustomToolOptions() == null) {
+ out.setCustomToolOptions(mergeOptionsMap);
+ }
+ }
+....
+
+
+ToolPlugin - Base class for the plugin
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Once the tool is developed, you need to wrap it with a plugin class and
+register that plugin class with Sqoop. Your plugin class should extend
++org.apache.sqoop.tool.ToolPlugin+ and override the +getTools()+ method.
+Example: Let's say that you have developed a tool called hive-merge which
+merges two Hive tables and your tool class is HiveMergeTool. The plugin
+implementation will look like this:
+....
+public class HiveMergePlugin extends ToolPlugin {
+
+ @Override
+ public List<ToolDesc> getTools() {
+ return Collections
+ .singletonList(new ToolDesc(
+ "hive-merge",
+ HiveMergeTool.class,
+ "This tool is used to perform the merge data from a tmp hive table into a destination hive table."));
+ }
+
+}
+....
+
+Registering User defined plugin with Sqoop
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Finally, you need to copy your plugin jar to the $SQOOP_LIB directory and
+register the plugin class with Sqoop in sqoop-site.xml:
+....
+<property>
+ <name>sqoop.tool.plugins</name>
+ <value>com.expedia.sqoop.tool.HiveMergePlugin</value>
+ <description>A comma-delimited list of ToolPlugin implementations
+ which are consulted, in order, to register SqoopTool instances which
+ allow third-party tools to be used.
+ </description>
+ </property>
+....
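Note on plugin-arch.txt: the documentation shows how applyOptions fills the customToolOptions map, but not how a later phase such as run reads it back. A minimal, self-contained sketch of that read side follows. It deliberately uses no Sqoop classes so that it compiles on its own. The class name CustomToolOptionsReader, its helper methods, and the assumption that --merge-keys carries a comma-separated list are hypothetical and not part of this commit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Sqoop: reads values back out of the map
// that a tool's applyOptions() stored via SqoopOptions.setCustomToolOptions().
class CustomToolOptionsReader {
  static final String MERGE_KEYS = "merge-keys";
  static final String RETAIN_UPDATES_TBL = "retain-updates-tbl";

  // Flag-style options are stored with an empty value in the doc example,
  // so the presence of the key is what signals that the flag was given.
  static boolean isFlagSet(Map<String, String> opts, String key) {
    return opts != null && opts.containsKey(key);
  }

  // Assumption: merge keys are passed as a comma-separated list.
  static List<String> mergeKeys(Map<String, String> opts) {
    if (opts == null || opts.get(MERGE_KEYS) == null) {
      return Collections.emptyList();
    }
    List<String> keys = new ArrayList<String>();
    for (String part : opts.get(MERGE_KEYS).split(",")) {
      String key = part.trim();
      if (!key.isEmpty()) {
        keys.add(key);
      }
    }
    return keys;
  }

  public static void main(String[] args) {
    // Stand-in for options.getCustomToolOptions() inside run().
    Map<String, String> opts = new HashMap<String, String>();
    opts.put(MERGE_KEYS, "id, region");
    opts.put(RETAIN_UPDATES_TBL, "");
    System.out.println(mergeKeys(opts));
    System.out.println(isFlagSet(opts, RETAIN_UPDATES_TBL));
  }
}
```

In a real tool the map would come from the SqoopOptions instance passed to run, and the null checks cover the case where no custom options were ever set.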
http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/java/org/apache/sqoop/SqoopOptions.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/sqoop/SqoopOptions.java b/src/java/org/apache/sqoop/SqoopOptions.java
index 9405605..ace90fd 100644
--- a/src/java/org/apache/sqoop/SqoopOptions.java
+++ b/src/java/org/apache/sqoop/SqoopOptions.java
@@ -35,6 +35,7 @@ import org.apache.hadoop.conf.Configuration;
import org.apache.sqoop.accumulo.AccumuloConstants;
import org.apache.sqoop.util.CredentialsUtil;
import org.apache.sqoop.util.LoggingUtils;
+import org.apache.sqoop.util.SqoopJsonUtil;
import org.apache.sqoop.util.password.CredentialProviderHelper;
import org.apache.sqoop.validation.AbortOnFailureHandler;
import org.apache.sqoop.validation.AbsoluteValidationThreshold;
@@ -91,6 +92,9 @@ public class SqoopOptions implements Cloneable {
}
}
+ @StoredAsProperty("customtool.options.jsonmap")
+ private Map<String, String> customToolOptions;
+
// TODO(aaron): Adding something here? Add a setter and a getter. Add a
// default value in initDefaults() if you need one. If this value needs to
// be serialized in the metastore, it should be marked with
@@ -613,6 +617,9 @@ public class SqoopOptions implements Cloneable {
} else if (typ.isEnum()) {
f.set(this, Enum.valueOf(typ,
props.getProperty(propName, f.get(this).toString())));
+ } else if (typ.equals(Map.class)) {
+ f.set(this,
+ SqoopJsonUtil.getMapforJsonString(props.getProperty(propName)));
} else {
throw new RuntimeException("Could not retrieve property "
+ propName + " for type: " + typ);
@@ -731,6 +738,11 @@ public class SqoopOptions implements Cloneable {
f.get(this) == null ? "null" : f.get(this).toString());
} else if (typ.isEnum()) {
putProperty(props, propName, f.get(this).toString());
+ } else if (typ.equals(Map.class)) {
+ putProperty(
+ props,
+ propName,
+ SqoopJsonUtil.getJsonStringforMap((Map) f.get(this)));
} else {
throw new RuntimeException("Could not set property "
+ propName + " for type: " + typ);
@@ -2564,4 +2576,12 @@ public class SqoopOptions implements Cloneable {
public void setHCatalogPartitionValues(String hpvs) {
this.hCatalogPartitionValues = hpvs;
}
+
+ public Map<String, String> getCustomToolOptions() {
+ return customToolOptions;
+ }
+
+ public void setCustomToolOptions(Map<String, String> customToolOptions) {
+ this.customToolOptions = customToolOptions;
+ }
}
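Note on the loadProperties hunk in SqoopOptions.java: getMapforJsonString throws IllegalArgumentException for a null or empty string, and props.getProperty(propName) returns null when the key is absent, for example in a property set written before this field existed. Whether that can occur in practice depends on how saved options are loaded, which this diff does not show. A defensive variant might look like the sketch below. MapPropertyLoader and loadMapProperty are hypothetical names, the parser argument stands in for SqoopJsonUtil.getMapforJsonString so the sketch compiles without Sqoop or a JSON library, and java.util.function requires Java 8.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical sketch, not part of this commit.
class MapPropertyLoader {

  // Same emptiness rule as SqoopJsonUtil.isEmptyJSON in this commit.
  static boolean isEmptyJson(String json) {
    return json == null || json.isEmpty() || "{}".equals(json);
  }

  // Returns an empty map when the property is missing or empty, and only
  // hands non-empty JSON text to the parser.
  static Map<String, String> loadMapProperty(Properties props, String name,
      Function<String, Map<String, String>> parser) {
    String json = props.getProperty(name);
    if (isEmptyJson(json)) {
      return new HashMap<String, String>();
    }
    return parser.apply(json);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // The property is absent, so the parser must never be invoked.
    Map<String, String> loaded = loadMapProperty(props,
        "customtool.options.jsonmap",
        json -> { throw new IllegalStateException("parser was called"); });
    System.out.println(loaded.isEmpty());
  }
}
```

Returning an empty map rather than null matches what a round trip already produces: a null map is written as "{}", which parses back to an empty map.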
http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/java/org/apache/sqoop/util/SqoopJsonUtil.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/sqoop/util/SqoopJsonUtil.java b/src/java/org/apache/sqoop/util/SqoopJsonUtil.java
new file mode 100644
index 0000000..c1b40cd
--- /dev/null
+++ b/src/java/org/apache/sqoop/util/SqoopJsonUtil.java
@@ -0,0 +1,82 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.sqoop.util;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.codehaus.jackson.JsonParseException;
+import org.codehaus.jackson.map.JsonMappingException;
+import org.codehaus.jackson.map.ObjectMapper;
+import org.codehaus.jackson.type.TypeReference;
+import org.json.JSONObject;
+
+public class SqoopJsonUtil {
+
+ public static final Log LOG = LogFactory
+ .getLog(SqoopJsonUtil.class.getName());
+
+ public static String getJsonStringforMap(Map<String, String> map) {
+ JSONObject pathPartMap = new JSONObject(map);
+ return pathPartMap.toString();
+ }
+
+ public static Map<String, String> getMapforJsonString(String mapJsonStr) {
+ if ("".equals(mapJsonStr) || null == mapJsonStr) {
+ throw new IllegalArgumentException("Passed Null for map " + mapJsonStr);
+ }
+
+ LOG.debug("Passed mapJsonStr ::" + mapJsonStr + " to parse");
+ Map<String, String> partPathMap = new HashMap<String, String>();
+ ObjectMapper mapper = new ObjectMapper();
+ try {
+ partPathMap = mapper.readValue(mapJsonStr,
+ new TypeReference<HashMap<String, String>>() {
+ });
+ return partPathMap;
+ } catch (JsonParseException e) {
+ LOG.error("JsonParseException:: Illegal json to parse into map :"
+ + mapJsonStr + e.getMessage());
+ throw new IllegalArgumentException("Illegal json to parse into map :"
+ + mapJsonStr + e.getMessage(), e);
+ } catch (JsonMappingException e) {
+ LOG.error("JsonMappingException:: Illegal json to parse into map :"
+ + mapJsonStr + e.getMessage());
+ throw new IllegalArgumentException("Illegal json to parse into map :"
+ + mapJsonStr + e.getMessage(), e);
+ } catch (IOException e) {
+ LOG.error("IOException while parsing json into map :" + mapJsonStr
+ + e.getMessage());
+ throw new IllegalArgumentException(
+ "IOException while parsing json into map :" + mapJsonStr
+ + e.getMessage(), e);
+ }
+ }
+
+ public static boolean isEmptyJSON(String jsonStr) {
+ if (null == jsonStr || "".equals(jsonStr) || "{}".equals(jsonStr)) {
+ return true;
+ } else {
+ return false;
+ }
+ }
+}
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java
----------------------------------------------------------------------
diff --git a/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java b/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java
new file mode 100644
index 0000000..4cbc8a1
--- /dev/null
+++ b/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java
@@ -0,0 +1,87 @@
+package org.apache.sqoop.util;
+
+import static org.junit.Assert.assertEquals;
+
+import java.util.HashMap;
+import java.util.Map;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class TestSqoopJsonUtil {
+
+ private SqoopJsonUtil jsonUtil;
+ private static Map<String, String> paramMap;
+ private static String jsonStr;
+
+ @BeforeClass
+ public static void setup() {
+ paramMap = new HashMap<String, String>();
+ paramMap.put("k1", "v1");
+ paramMap.put("k2", "v2");
+ paramMap.put("k3", "v3");
+
+ jsonStr = "{\"k3\":\"v3\",\"k1\":\"v1\",\"k2\":\"v2\"}";
+
+ }
+
+ @Test
+ public void testGetJsonStringFromMap() {
+ String resultJsonStr = SqoopJsonUtil.getJsonStringforMap(paramMap);
+ assertEquals(jsonStr, resultJsonStr);
+ }
+
+ @Test
+ public void testGetJsonStringFromMapNullMap() {
+ Map<String, String> nullMap = null;
+ String resultJsonStr = SqoopJsonUtil.getJsonStringforMap(nullMap);
+ assertEquals("{}", resultJsonStr);
+ }
+
+ @Test
+ public void testGetJsonStringFromMapEmptyMap() {
+ Map<String, String> nullMap = new HashMap<String, String>();
+ String resultJsonStr = SqoopJsonUtil.getJsonStringforMap(nullMap);
+ assertEquals("{}", resultJsonStr);
+ }
+
+ @Test
+ public void testGetMapforJsonString() {
+ Map<String, String> resultMap = SqoopJsonUtil.getMapforJsonString(jsonStr);
+ assertEquals(paramMap, resultMap);
+ }
+
+ @Test(expected = IllegalArgumentException.class)
+ public void testGetMapforJsonStringNullString() {
+ Map<String, String> resultMap = SqoopJsonUtil.getMapforJsonString(null);
+ }
+
+ @Test(expected = IllegalArgumentException.class)
+ public void testGetMapforJsonStringEmptyString() {
+ Map<String, String> resultMap = SqoopJsonUtil.getMapforJsonString("");
+ }
+
+ @Test
+ public void testEmptyJSON() {
+ String jsonStr = null;
+ boolean isEmpty;
+ isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+ assertEquals(true, isEmpty);
+
+ jsonStr = "";
+ isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+ assertEquals(true, isEmpty);
+
+ jsonStr = "{}";
+ isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+ assertEquals(true, isEmpty);
+
+ }
+
+ @Test
+ public void testNonEmptyJSON() {
+ boolean isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+ assertEquals(false, isEmpty);
+ }
+
+}
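Note on TestSqoopJsonUtil: testGetJsonStringFromMap compares the serialized string against one fixed key order. JSON object members are unordered, and the order produced here depends on the iteration order of the underlying map, which is not guaranteed, so the assertion may be fragile across library or JVM versions. Comparing maps instead of strings avoids that. In the real test this could mean parsing resultJsonStr with getMapforJsonString and asserting equality with paramMap. The sketch below only illustrates the principle with plain java.util maps; OrderInsensitiveCompare and its ordered helper are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: map equality ignores insertion order,
// while the textual form of an ordered map does not.
class OrderInsensitiveCompare {

  // Builds a map that preserves the order of the given key/value pairs.
  static Map<String, String> ordered(String... keyValues) {
    Map<String, String> map = new LinkedHashMap<String, String>();
    for (int i = 0; i + 1 < keyValues.length; i += 2) {
      map.put(keyValues[i], keyValues[i + 1]);
    }
    return map;
  }

  public static void main(String[] args) {
    Map<String, String> a = ordered("k3", "v3", "k1", "v1", "k2", "v2");
    Map<String, String> b = ordered("k1", "v1", "k2", "v2", "k3", "v3");
    // Same entries, different order: equal as maps, different as text.
    System.out.println(a.equals(b));
    System.out.println(a.toString().equals(b.toString()));
  }
}
```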