You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@sqoop.apache.org by ve...@apache.org on 2015/08/08 05:42:41 UTC

sqoop git commit: SQOOP-2333: Sqoop to support Custom options for User Defined Plugins(Tool)

Repository: sqoop
Updated Branches:
  refs/heads/trunk 1b32a53e6 -> 73ef0c133


SQOOP-2333: Sqoop to support Custom options for User Defined Plugins(Tool)

     (Rakesh Sharma via Venkat Ranganathan)


Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/73ef0c13
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/73ef0c13
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/73ef0c13

Branch: refs/heads/trunk
Commit: 73ef0c133bb1be875c92fe42e9e13c58d77d8360
Parents: 1b32a53
Author: Venkat Ranganathan <ve...@hortonworks.com>
Authored: Fri Aug 7 20:42:11 2015 -0700
Committer: Venkat Ranganathan <ve...@hortonworks.com>
Committed: Fri Aug 7 20:42:11 2015 -0700

----------------------------------------------------------------------
 src/docs/dev/SqoopDevGuide.txt                  |   2 +-
 src/docs/dev/plugin-arch.txt                    | 184 +++++++++++++++++++
 src/java/org/apache/sqoop/SqoopOptions.java     |  20 ++
 .../org/apache/sqoop/util/SqoopJsonUtil.java    |  82 +++++++++
 .../apache/sqoop/util/TestSqoopJsonUtil.java    |  87 +++++++++
 5 files changed, 374 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/docs/dev/SqoopDevGuide.txt
----------------------------------------------------------------------
diff --git a/src/docs/dev/SqoopDevGuide.txt b/src/docs/dev/SqoopDevGuide.txt
index a81cfda..8f9ba24 100644
--- a/src/docs/dev/SqoopDevGuide.txt
+++ b/src/docs/dev/SqoopDevGuide.txt
@@ -48,5 +48,5 @@ include::compiling.txt[]
 
 include::api-reference.txt[]
 
-
+include::plugin-arch.txt[]
 

http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/docs/dev/plugin-arch.txt
----------------------------------------------------------------------
diff --git a/src/docs/dev/plugin-arch.txt b/src/docs/dev/plugin-arch.txt
new file mode 100644
index 0000000..11e3c6e
--- /dev/null
+++ b/src/docs/dev/plugin-arch.txt
@@ -0,0 +1,184 @@
+
+////
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+////
+
+Developing Sqoop Plugins
+------------------------
+Sqoop allows users to develop their own plugins. Users can develop their
+plugins as separate jars, deploy them in $SQOOP_LIB and register with
+sqoop. Infact, Sqoop architecture is a plugin based architecture and all
+the internal tools like import, export, merge etc are also supported as
+tool plugins. Users can also develop their own custom tool plugins. Once
+deployed and registered with sqoop, these plugins will work like any
+other internal tool. They will also get listed in the tools when you run
++sqoop help+ command.
+
+BaseSqoopTool - Base class for User defined Tools
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+BaseSqoopTool is the base class for all Sqoop Tools. If you want to develop
+a cusom tool, you need to inherit your tool from BaseSqoopTool and override
+the following methods:
+
+- +public int run(SqoopOptions options)+ : This is the main method for the
+   tool and acts as entry point for execution for your custom tool.
+- +public void configureOptions(ToolOptions toolOptions)+ : Configures the
+   command-line arguments we expect to receive. You can also specify the
+   description of all the command line arguments. When a user executes
+   +sqoop help <your tool>+, the information which is provided in this
+   method will be output to the user.
+- +public void applyOptions(CommandLine in, SqoopOptions out)+ : parses all
+   options and populates SqoopOptions which acts as a data transfer object
+   during the complete execution.
+- +public void validateOptions(SqoopOptions options)+ : provide any
+   validations required for your options.
+
+Supporting User defined custom options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Sqoop parses the arguments which are passed by users and are stored in
+SqoopOptions object. This object then acts as data transfer object. This
+object is passed to various phases of processing like preprocessing before
+running the actual MapReduce, MapReduce phase and even postprocessing phase.
+This class has a lot of members. The options are parsed and populated in the
+respective member. Now lets say that a user creates a new user defined tool
+and this tool has some new options which don't map to any of the existing
+members of the SqoopOptions class. Either user can add a new member to
+SqoopOption class which means users will have to make changes in sqoop and
+compile it, which mght not be possible always for all users. Other option
+is to use +extraArgs+ member. This is a string array which contains the
+options for thirdparty tools which could be passed directly to the third
+party tool like mysqldump etc. This array string needs parsing every time
+to understand the parameters.
+The most elegant way of supporting custom options for user defined tool is
++customToolOptions+ map. This is a map member of SqoopOption class.
+Developer can parse the user defined parameters and populate this map with
+appropriate key/value pairs. When SqoopOption object is passed to various
+phases of processing these values will be readily available and parsing is
+not required for every access.
+Lets take an example to understand the usage better. Lets say you want to
+develop a custom tool to merge two hive tables and it will take the following
+parameters :
+
+- +--hive-updates-database+
+- +--hive-updates-table+
+- +--merge-keys+
+- +--retain-updates-tbl+
+
+None of these options are available in SqoopOption object. Tool Developer
+can override the +applyOptions+ method and in this method the user options
+can be parsed and populated in the customToolOptions map. Once that is done,
+SqoopOption object can be passed throughout program and these values will
+be available for users.
+
+These option names will be stored as keys and the values passed by users
+will be stored as values. Lets define these options as static finals :
+....
+  public static final String MERGE_KEYS = "merge-keys";
+  public static final String HIVE_UPDATES_TABLE = "hive-updates-table";
+  public static final String HIVE_UPDATES_TABLE_DB = "hive-updates-database";
+  public static final String RETAIN_UPDATES_TBL = "retain-updates-tbl";
+....
+
+A sample applyOptions example which parses the above said options and
+populates the customToolOptions map is below :
+....
+ public void applyOptions(CommandLine in, SqoopOptions out)
+    throws InvalidOptionsException {
+
+    if (in.hasOption(VERBOSE_ARG)) {
+      LoggingUtils.setDebugLevel();
+      log.debug("Enabled debug logging.");
+    }
+
+    if (in.hasOption(HELP_ARG)) {
+      ToolOptions toolOpts = new ToolOptions();
+      configureOptions(toolOpts);
+      printHelp(toolOpts);
+      throw new InvalidOptionsException("");
+    }
+
+    Map<String, String> mergeOptionsMap = new HashMap<String, String>();
+    if (in.hasOption(MERGE_KEYS)) {
+      mergeOptionsMap.put(MERGE_KEYS, in.getOptionValue(MERGE_KEYS));
+    }
+
+    if (in.hasOption(HIVE_UPDATES_TABLE)) {
+      mergeOptionsMap.put(HIVE_UPDATES_TABLE,
+        in.getOptionValue(HIVE_UPDATES_TABLE));
+    }
+
+    if (in.hasOption(HIVE_UPDATES_TABLE_DB)) {
+      mergeOptionsMap.put(HIVE_UPDATES_TABLE_DB,
+        in.getOptionValue(HIVE_UPDATES_TABLE_DB));
+    }
+
+    if (in.hasOption(RETAIN_UPDATES_TBL)) {
+      mergeOptionsMap.put(RETAIN_UPDATES_TBL, "");
+    }
+
+    if (in.hasOption(HIVE_TABLE_ARG)) {
+      out.setHiveTableName(in.getOptionValue(HIVE_TABLE_ARG));
+    }
+
+    if (in.hasOption(HIVE_DATABASE_ARG)) {
+      out.setHiveDatabaseName(in.getOptionValue(HIVE_DATABASE_ARG));
+    }
+
+    if (out.getCustomToolOptions() == null) {
+      out.setCustomToolOptions(mergeOptionsMap);
+    }
+  }
+....
+
+
+ToolPlugin - Base class for the plugin
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Once the tool is developed, you need to wrap it with a plugin class and
+register that plugin class with Sqoop. Your plugin class should extend from
++org.apache.sqoop.tool.ToolPlugin+  and override +getTools()+ method.
+Example: Lets say that you have developed a tool called hive-merge which
+merges 2 hive tables and your Tool class is HiveMergeTool, the plugin
+implementation will look like
+....
+public class HiveMergePlugin extends ToolPlugin {
+
+  @Override
+  public List<ToolDesc> getTools() {
+    return Collections
+      .singletonList(new ToolDesc(
+        "hive-merge",
+        HiveMergeTool.class,
+        "This tool is used to perform the merge data from a tmp hive table into a destination hive table."));
+  }
+
+}
+....
+
+Registering User defined plugin with Sqoop
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Finally you need to copy your plugin jar to $SQOOP_LIB directory and register the
+plugin class with sqoop in sqoop-site.xml :
+....
+<property>
+    <name>sqoop.tool.plugins</name>
+    <value>com.expedia.sqoop.tool.HiveMergePlugin</value>
+    <description>A comma-delimited list of ToolPlugin implementations
+      which are consulted, in order, to register SqoopTool instances which
+      allow third-party tools to be used.
+    </description>
+  </property>
+....

http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/java/org/apache/sqoop/SqoopOptions.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/sqoop/SqoopOptions.java b/src/java/org/apache/sqoop/SqoopOptions.java
index 9405605..ace90fd 100644
--- a/src/java/org/apache/sqoop/SqoopOptions.java
+++ b/src/java/org/apache/sqoop/SqoopOptions.java
@@ -35,6 +35,7 @@ import org.apache.hadoop.conf.Configuration;
 import org.apache.sqoop.accumulo.AccumuloConstants;
 import org.apache.sqoop.util.CredentialsUtil;
 import org.apache.sqoop.util.LoggingUtils;
+import org.apache.sqoop.util.SqoopJsonUtil;
 import org.apache.sqoop.util.password.CredentialProviderHelper;
 import org.apache.sqoop.validation.AbortOnFailureHandler;
 import org.apache.sqoop.validation.AbsoluteValidationThreshold;
@@ -91,6 +92,9 @@ public class SqoopOptions implements Cloneable {
     }
   }
 
+  @StoredAsProperty("customtool.options.jsonmap")
+  private Map<String, String> customToolOptions;
+
   // TODO(aaron): Adding something here? Add a setter and a getter.  Add a
   // default value in initDefaults() if you need one.  If this value needs to
   // be serialized in the metastore, it should be marked with
@@ -613,6 +617,9 @@ public class SqoopOptions implements Cloneable {
           } else if (typ.isEnum()) {
             f.set(this, Enum.valueOf(typ,
                 props.getProperty(propName, f.get(this).toString())));
+          }  else if (typ.equals(Map.class)) {
+            f.set(this,
+                SqoopJsonUtil.getMapforJsonString(props.getProperty(propName)));
           } else {
             throw new RuntimeException("Could not retrieve property "
                 + propName + " for type: " + typ);
@@ -731,6 +738,11 @@ public class SqoopOptions implements Cloneable {
                 f.get(this) == null ? "null" : f.get(this).toString());
           } else if (typ.isEnum()) {
             putProperty(props, propName, f.get(this).toString());
+          } else if (typ.equals(Map.class)) {
+            putProperty(
+                props,
+                propName,
+                SqoopJsonUtil.getJsonStringforMap((Map) f.get(this)));
           } else {
             throw new RuntimeException("Could not set property "
                 + propName + " for type: " + typ);
@@ -2564,4 +2576,12 @@ public class SqoopOptions implements Cloneable {
   public void setHCatalogPartitionValues(String hpvs) {
     this.hCatalogPartitionValues = hpvs;
   }
+
+  public Map<String, String> getCustomToolOptions() {
+    return customToolOptions;
+  }
+
+  public void setCustomToolOptions(Map<String, String> customToolOptions) {
+    this.customToolOptions = customToolOptions;
+  }
 }

http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/java/org/apache/sqoop/util/SqoopJsonUtil.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/sqoop/util/SqoopJsonUtil.java b/src/java/org/apache/sqoop/util/SqoopJsonUtil.java
new file mode 100644
index 0000000..c1b40cd
--- /dev/null
+++ b/src/java/org/apache/sqoop/util/SqoopJsonUtil.java
@@ -0,0 +1,82 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.sqoop.util;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.codehaus.jackson.JsonParseException;
+import org.codehaus.jackson.map.JsonMappingException;
+import org.codehaus.jackson.map.ObjectMapper;
+import org.codehaus.jackson.type.TypeReference;
+import org.json.JSONObject;
+
+public class SqoopJsonUtil {
+
+  public static final Log LOG = LogFactory
+    .getLog(SqoopJsonUtil.class.getName());
+
+  public static String getJsonStringforMap(Map<String, String> map) {
+    JSONObject pathPartMap = new JSONObject(map);
+    return pathPartMap.toString();
+  }
+
+  public static Map<String, String> getMapforJsonString(String mapJsonStr) {
+    if ("".equals(mapJsonStr) || null == mapJsonStr) {
+      throw new IllegalArgumentException("Passed Null for map " + mapJsonStr);
+    }
+
+    LOG.debug("Passed mapJsonStr ::" + mapJsonStr + " to parse");
+    Map<String, String> partPathMap = new HashMap<String, String>();
+    ObjectMapper mapper = new ObjectMapper();
+    try {
+      partPathMap = mapper.readValue(mapJsonStr,
+        new TypeReference<HashMap<String, String>>() {
+        });
+      return partPathMap;
+    } catch (JsonParseException e) {
+      LOG.error("JsonParseException:: Illegal json to parse into map :"
+        + mapJsonStr + e.getMessage());
+      throw new IllegalArgumentException("Illegal json to parse into map :"
+        + mapJsonStr + e.getMessage(), e);
+    } catch (JsonMappingException e) {
+      LOG.error("JsonMappingException:: Illegal json to parse into map :"
+        + mapJsonStr + e.getMessage());
+      throw new IllegalArgumentException("Illegal json to parse into map :"
+        + mapJsonStr + e.getMessage(), e);
+    } catch (IOException e) {
+      LOG.error("IOException while parsing json into map :" + mapJsonStr
+        + e.getMessage());
+      throw new IllegalArgumentException(
+        "IOException while parsing json into map :" + mapJsonStr
+          + e.getMessage(), e);
+    }
+  }
+
+  public static boolean isEmptyJSON(String jsonStr) {
+    if (null == jsonStr || "".equals(jsonStr) || "{}".equals(jsonStr)) {
+      return true;
+    } else {
+      return false;
+    }
+  }
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/sqoop/blob/73ef0c13/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java
----------------------------------------------------------------------
diff --git a/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java b/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java
new file mode 100644
index 0000000..4cbc8a1
--- /dev/null
+++ b/src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java
@@ -0,0 +1,87 @@
+package org.apache.sqoop.util;
+
+import static org.junit.Assert.assertEquals;
+
+import java.util.HashMap;
+import java.util.Map;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class TestSqoopJsonUtil {
+
+  private SqoopJsonUtil jsonUtil;
+  private static Map<String, String> paramMap;
+  private static String jsonStr;
+
+  @BeforeClass
+  public static void setup() {
+    paramMap = new HashMap<String, String>();
+    paramMap.put("k1", "v1");
+    paramMap.put("k2", "v2");
+    paramMap.put("k3", "v3");
+
+    jsonStr = "{\"k3\":\"v3\",\"k1\":\"v1\",\"k2\":\"v2\"}";
+
+  }
+
+  @Test
+  public void testGetJsonStringFromMap() {
+    String resultJsonStr = SqoopJsonUtil.getJsonStringforMap(paramMap);
+    assertEquals(jsonStr, resultJsonStr);
+  }
+
+  @Test
+  public void testGetJsonStringFromMapNullMap() {
+    Map<String, String> nullMap = null;
+    String resultJsonStr = SqoopJsonUtil.getJsonStringforMap(nullMap);
+    assertEquals("{}", resultJsonStr);
+  }
+
+  @Test
+  public void testGetJsonStringFromMapEmptyMap() {
+    Map<String, String> nullMap = new HashMap<String, String>();
+    String resultJsonStr = SqoopJsonUtil.getJsonStringforMap(nullMap);
+    assertEquals("{}", resultJsonStr);
+  }
+
+  @Test
+  public void testGetMapforJsonString() {
+    Map<String, String> resultMap = SqoopJsonUtil.getMapforJsonString(jsonStr);
+    assertEquals(paramMap, resultMap);
+  }
+
+  @Test(expected = IllegalArgumentException.class)
+  public void testGetMapforJsonStringNullString() {
+    Map<String, String> resultMap = SqoopJsonUtil.getMapforJsonString(null);
+  }
+
+  @Test(expected = IllegalArgumentException.class)
+  public void testGetMapforJsonStringEmptyString() {
+    Map<String, String> resultMap = SqoopJsonUtil.getMapforJsonString("");
+  }
+
+  @Test
+  public void testEmptyJSON() {
+    String jsonStr = null;
+    boolean isEmpty;
+    isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+    assertEquals(true, isEmpty);
+
+    jsonStr = "";
+    isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+    assertEquals(true, isEmpty);
+
+    jsonStr = "{}";
+    isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+    assertEquals(true, isEmpty);
+
+  }
+
+  @Test
+  public void testNonEmptyJSON() {
+    boolean isEmpty = SqoopJsonUtil.isEmptyJSON(jsonStr);
+    assertEquals(false, isEmpty);
+  }
+
+}